ABOUT DATA SCIENCE
ALL YOU NEED TO KNOW ABOUT DATA SCIENCE:
Data science is an interdisciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge and insights from structured and unstructured data. Data science is related to data mining and big data.
Everyone will tell you that you need to comment your code. You do it for yourself and for others, and it can help you lay out the structure of your code before you start coding properly. But writing a lot of comments might give you a false sense of confidence that you are doing a good job, while in reality you are filling your code with obvious, redundant statements that bring no value. The role of a comment is to explain, not to describe. Any comment has to add information to the code you already have, not duplicate it.
Keep in mind that you are not narrating the code or adding ‘subtitles’ to Python’s performance. The comments are there to clarify what is not explicit in the code itself. Adding a comment that says what a line of code does is redundant most of the time.
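To make the difference concrete, here is a minimal sketch that puts a redundant comment next to an explanatory one. The function name `clean_ages` and the cutoff of 120 are hypothetical, chosen only for illustration.

```python
def clean_ages(ages):
    # Redundant: the next comment only repeats what the line already says.
    # Keep the ages that are not None.
    known = [age for age in ages if age is not None]

    # Useful: this comment gives a reason the code cannot express by itself.
    # Ages above 120 are treated as data-entry errors (an assumption
    # made up for this sketch), so they are dropped.
    return [age for age in known if age <= 120]


print(clean_ages([25, None, 130, 40]))
```

The first comment could be deleted without losing anything. The second one records a decision that a reader could not recover from the code alone.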
A good rule of thumb would be: if it starts to sound like an instastory, rethink it. ‘So, I am having my breakfast, with a chai latte and my friend, the cat is here as well’. No.
It is also a good thing to learn to always update necessary comments before you modify the code. It is incredibly easy to modify a line of code, move on, and forget the comment. There are people who claim that there are very few crimes in the world worse than comments that contradict the code itself.
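As a small illustration of how a comment drifts away from the code, consider a threshold that was changed without touching its comment. The function names and the values 0.5 and 0.7 are hypothetical, and the named constant is just one possible way to reduce the risk.

```python
def is_high_risk_stale(score):
    # Flag scores above 0.5 as high risk.  <- stale: the code now uses 0.7
    return score > 0.7


# Naming the value removes the duplicated number, so the comment can
# explain the reasoning and has nothing to fall out of sync with.
HIGH_RISK_THRESHOLD = 0.7  # hypothetical value chosen for this sketch


def is_high_risk(score):
    # Strictly greater than: a score equal to the threshold is not flagged.
    return score > HIGH_RISK_THRESHOLD
```

In the first function a reader has to decide whether to trust the comment or the code. In the second, the number lives in one place only.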
Of course, there are situations where you are preparing a tutorial for others and want to narrate what the code is doing. In that case, writing that the load function loads the data is fine, because it may not be obvious to your audience. When teaching, repetition and overly explicit explanations are more than welcome. Always keep in mind who your reader will be.
Study: What Skills Do You Need to Become a Data Scientist in 2020?
If you have recently browsed job boards and considered potential career paths, you probably know that the “data scientist” position is one of the most exciting opportunities in the market. Many people find this profession attractive, as it comes with several key advantages: salary, career opportunities, personal development, intellectual challenges, and multiple companies competing for few available candidates.
This sounds great, but how does a person become a qualified applicant for a data scientist position?
To understand this better, we conducted our “1,001 data scientist LinkedIn profiles” research for the third year in a row. Now, we are not only able to delineate the typical traits of data science professionals in 2020, but we can also compare this data with the 2018 and 2019 figures.
At a Glance
In 2020, our study portrays the collective image of a data scientist as male (71%), bilingual, and in the workforce for 8.5 years (3.5 of them as a data scientist). This typical data scientist works with Python and/or R and has a Master’s degree.
For the third year in a row, the majority of data scientists in the research are male. The proportion has remained very stable (70%-30% in 2018, 69%-31% in 2019, and 71%-29% in 2020) and is likely a true representation of the actual situation in the workplace. Nevertheless, the expectation is that, as the profession becomes more established, the gender gap should narrow in the years to come.
What Professional Experience Do Data Scientists Have?
If we had to name the three programming languages data scientists use most frequently in 2020, they would be Python (73%), R (56%), and SQL (51%). Beyond the top three, other popular coding languages are MATLAB (20%) and Java (16%). In fact, this insight is the most actionable of the whole study. It gives people interested in a data science career clear evidence on where to focus.
At the moment, Python appears to be the data scientist’s preferred tool for data processing and problem-solving. Three years ago, things looked a bit different. In 2018, Python and R had the same level of adoption according to our research (53%). The tide turned in 2019, when Python took the lead with 54% vs 45%, and now we can see that Python has established itself as the industry’s coding language of choice, with a significant lead over R.
Another interesting finding in 2020 is that fewer data scientists are in their first year on the job (13%) compared to previous periods (25% in 2018 and 2019). A few years ago, as data science had just emerged, companies were recruiting professionals with different backgrounds and training them in-house. As a result, in some cases, relatively junior candidates were hired for senior data scientist roles. Our numbers show that, as more people gain experience in the field, first-year data scientists account for a smaller portion of the total.
The idea that experience plays a bigger role in recruiting is reinforced by the finding that the average data science professional in 2020 has been in the workforce for 8.5 years, while our 2018 data showed an average of 4.5 years of working experience.
Therefore, in today’s job market, one needs to accumulate the necessary working experience in an analytical position before they are ready for a data scientist job title.
Our study examined data scientists’ previous occupations one and two jobs back. Two positions prior to their current role, the data scientists in our sample were most often already Data Scientists (29%), Analysts (17%), or in Academia (12%). The figures change (but the roles stay the same) when we look at the positions our cohort occupied immediately before entering their current role: 52% for Data Scientists, 11% for Analysts, and 8% for Academia.

