Tuesday, September 22, 2015

Data Scientist Skills

I have analysed a dataset of 974 LinkedIn job advertisements for data scientist positions based in the US. The skills listed in the dataset are classified as "cloud_software_required", "database_software_required", "statistic_software_required", and "programming_language_required".
The most frequent skills in the dataset are the following:

The numbers represent the occurrences of each skill in the dataset. However, they say nothing about the associations between these skills, so I have used the Apriori algorithm to find association rules, using a minimum confidence of 70%:
This means, for instance, that among the job descriptions in which R was required, at least 70% required Python too.
Data source: http://www.crowdflower.com/data-for-everyone
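To make the confidence measure concrete, here is a minimal sketch in Python. It is not the analysis used for this post and not a full Apriori implementation: it only counts skills and computes the confidence of single-skill rules of the form "A is required, so B is required". The toy `ads` list and the function names `skill_counts` and `pairwise_rules` are made up for illustration and have nothing to do with the real dataset.

```python
from collections import Counter
from itertools import permutations

# Hypothetical toy data: each set holds the skills required by one job ad.
# These are NOT values from the real dataset.
ads = [
    {"R", "Python", "SQL"},
    {"R", "Python"},
    {"R", "Python", "Hadoop"},
    {"R", "SAS"},
    {"Python", "SQL"},
]


def skill_counts(ads):
    """Count in how many ads each skill appears."""
    counts = Counter()
    for skills in ads:
        counts.update(skills)
    return counts


def pairwise_rules(ads, min_confidence=0.7):
    """Return {(a, b): confidence} for rules a -> b meeting the threshold.

    confidence(a -> b) = (ads requiring both a and b) / (ads requiring a)
    """
    counts = skill_counts(ads)
    pair_counts = Counter()
    for skills in ads:
        # Every ordered pair (a, b) with a != b that occurs in this ad.
        pair_counts.update(permutations(sorted(skills), 2))

    rules = {}
    for (a, b), n_both in pair_counts.items():
        confidence = n_both / counts[a]
        if confidence >= min_confidence:
            rules[(a, b)] = confidence
    return rules


if __name__ == "__main__":
    for (a, b), conf in sorted(pairwise_rules(ads).items()):
        print(f"{a} -> {b}: confidence {conf:.2f}")
```

In the toy data, R appears in four ads and three of those also list Python, so the rule R -> Python has a confidence of 3/4 = 0.75 and passes the 70% threshold. Note that confidence is not symmetric in general: SQL -> Python has confidence 1.0 here, while Python -> SQL only reaches 0.5.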

