    KDnuggets » News » 2020 » Jan » Tutorials, Overviews » Top 5 must-have Data Science skills for 2020 ( 20:n02 )

    Top 5 must-have Data Science skills for 2020


    The standard job description for a Data Scientist has long highlighted skills in R, Python, SQL, and Machine Learning. With the field evolving, these core competencies are no longer enough to stay competitive in the job market.

    By , Data Scientist, Disneyland Paris.

    Update your skills for the 2020 data job market!

    Data Science is a competitive field, and people are quickly building more and more skills and experience. This has given rise to the booming job description of Machine Learning Engineer, and therefore, my advice for 2020 is that all Data Scientists need to be developers as well.

    To stay competitive, make sure to prepare yourself for new ways of working that come with new tools.


    1. Agile

    Agile is a method of organizing work that is already widely used by dev teams. Data Science roles are filled more and more by people whose original skillset is pure software development, and this gives rise to the role of Machine Learning Engineer.

    Post-its and Agile seem to go hand-in-hand.

    More and more, Data Scientists/Machine Learning Engineers are managed as developers: continuously making improvements to Machine Learning elements in an existing codebase.

    For this type of role, Data Scientists have to know the Agile way of working based on the Scrum method. It defines several roles for different people, and this role definition makes sure that continuous improvement can be implemented smoothly.


    2. GitHub

    Git and GitHub are tools for developers that are of great help when managing different versions of software. They track all changes that are made to a code base, and they make collaboration much easier when multiple developers change the same project at the same time.

    GitHub is the way to go.

    With the role of Data Scientist becoming more dev-heavy, it becomes key to be able to handle those dev tools. Git is becoming a serious job requirement, and it takes time to get used to best practices for using Git. It is easy to start working with Git when you're alone or when your coworkers are new to it, but when you join a team of Git experts while you're still a newbie, you might struggle more than you think.

    Git is the real skill to know for GitHub.


    3. Industrialization

    What is also changing in Data Science is the way we think about our projects. The Data Scientist is still the person who answers business questions with machine learning, as it has always been. But Data Science projects are more and more often developed for production systems, for example, as a micro-service in a larger software system.

    AWS is the biggest Cloud Vendor.

    At the same time, advanced types of models are getting more and more CPU and RAM intensive to execute, especially when working with Neural Networks and Deep Learning.

    In terms of the job description of a Data Scientist, it is becoming more important to think not only about the accuracy of your model but also about its execution time and other industrialization aspects of your project.
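As a rough illustration of looking at accuracy and execution time together, here is a minimal Python sketch. The `evaluate` helper, the `threshold_model` function, and the toy data are all hypothetical stand-ins invented for this example; a real project would plug in its own trained model and a proper test set.

```python
import time


def evaluate(predict, inputs, labels):
    """Return (accuracy, mean seconds per prediction) for a predict function."""
    start = time.perf_counter()
    predictions = [predict(x) for x in inputs]
    elapsed = time.perf_counter() - start
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels), elapsed / len(inputs)


# Hypothetical stand-in for a trained model: a simple threshold rule.
def threshold_model(x):
    return 1 if x >= 0.5 else 0


# Toy data, invented for illustration only.
inputs = [0.1, 0.4, 0.6, 0.9, 0.45, 0.55]
labels = [0, 0, 1, 1, 1, 1]

accuracy, seconds_per_call = evaluate(threshold_model, inputs, labels)
print(f"accuracy={accuracy:.2f}")
print(f"latency={seconds_per_call * 1e6:.1f} microseconds per prediction")
```

Reporting both numbers side by side makes it easier to notice when a more accurate model is too slow for the production system it has to run in.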

    Google also has a cloud service, just like Microsoft (Azure).


    4. Cloud and Big Data

    While industrialization of Machine Learning is becoming a more serious constraint for Data Scientists, it has also become a serious constraint for Data Engineers and IT in general.

    A famous comic (source: ).

    Where the Data Scientist can work on reducing the time needed by a model, the IT people can contribute by changing to faster compute services that are generally obtained in one or both of the following ways:

    • Cloud: moving compute resources to external vendors like AWS, Microsoft Azure, or Google Cloud makes it very easy to set up a very fast Machine Learning environment that can be accessed remotely. This requires Data Scientists to have a basic understanding of how the Cloud works, for example: working on remote servers instead of your own computer, or working on Linux rather than on Windows / Mac.

    PySpark is writing Python for parallel (Big Data) systems.

    • Big Data: a second aspect of faster IT is using Hadoop and Spark, which are tools that allow for the parallelization of tasks across many computers (worker nodes) at the same time. This requires a different approach to implementing models as a Data Scientist because your code must allow for parallel execution.
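To give a feel for what "code that allows for parallel execution" means, here is a small Python sketch of the split, map, and reduce pattern. It is not Spark or Hadoop code: local threads from the standard library stand in for worker nodes, and the function names (`split_into_partitions`, `partial_stats`, `parallel_mean`) are made up for this example. The point is only that each worker sees its own partition and the results must be combinable afterwards.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_partitions(rows, n_partitions):
    """Split rows into roughly equal chunks, one per worker."""
    return [rows[i::n_partitions] for i in range(n_partitions)]


def partial_stats(partition):
    """Map step: each worker summarizes only its own partition."""
    return sum(partition), len(partition)


def parallel_mean(rows, n_workers=4):
    """Compute a mean from per-partition sums and counts."""
    partitions = split_into_partitions(rows, n_workers)
    # Threads stand in for worker nodes in this local sketch.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(partial_stats, partitions))
    # Reduce step: combine the small per-partition results.
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    return total / count


values = list(range(1, 101))
print(parallel_mean(values))  # mean of 1..100
```

Note that the mean is not computed as an average of per-partition averages; each worker returns a sum and a count so that the final result stays correct even when partitions differ in size.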


    5. NLP, Neural Networks, and Deep Learning

    Until recently, it was still acceptable for a Data Scientist to consider NLP and image recognition as mere specializations of Data Science that not everyone has to master.

    You will need to understand Deep Learning: Machine Learning based on the idea of the human brain.

    But the use cases for image classification and NLP are getting more and more frequent, even in ‘regular’ business. Today, it has become unacceptable not to have at least basic knowledge of such models.

    Even if you do not have direct applications of such models in your job, a hands-on project is easy to find and will allow you to understand the steps needed in image and text projects.
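As one possible starting point, the sketch below walks through the usual steps of a tiny text project in plain Python: tokenizing, building a vocabulary, turning text into count vectors, and training a model. The model is a single perceptron (one artificial neuron), which is far simpler than a deep network and is used here only to show the workflow. The four example sentences and their labels are invented for illustration.

```python
def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def build_vocabulary(texts):
    """Map every distinct token to a column index."""
    vocab = {}
    for text in texts:
        for token in tokenize(text):
            vocab.setdefault(token, len(vocab))
    return vocab


def vectorize(text, vocab):
    """Bag-of-words counts; tokens outside the vocabulary are ignored."""
    vector = [0] * len(vocab)
    for token in tokenize(text):
        if token in vocab:
            vector[vocab[token]] += 1
    return vector


def train_perceptron(vectors, labels, epochs=20):
    """Classic perceptron rule: adjust weights only on mistakes."""
    weights = [0.0] * len(vectors[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(vectors, labels):
            activation = sum(w * xi for w, xi in zip(weights, x)) + bias
            error = y - (1 if activation > 0 else 0)
            if error:
                weights = [w + error * xi for w, xi in zip(weights, x)]
                bias += error
    return weights, bias


def predict(text, vocab, weights, bias):
    x = vectorize(text, vocab)
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0


# Toy sentiment data (1 = positive, 0 = negative), invented for this sketch.
texts = [
    "great movie loved it",
    "wonderful acting great story",
    "terrible movie hated it",
    "boring story awful acting",
]
labels = [1, 1, 0, 0]

vocab = build_vocabulary(texts)
vectors = [vectorize(t, vocab) for t in texts]
weights, bias = train_perceptron(vectors, labels)

print(predict("loved the great acting", vocab, weights, bias))
print(predict("hated this terrible story", vocab, weights, bias))
```

A real project would replace the perceptron with a neural network from a deep learning library and use a proper dataset, but the sequence of steps stays broadly the same.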

    . Reposted with permission.

    Bio: is a data scientist at Disneyland Paris with a strong focus on Machine Learning using R, Python, and SQL on a daily basis. Joos holds an MSc degree in applied data science and official certifications for AWS and SAS.

