RICHMOND OKEZIE

Aug 6, 2020 · 7 min

A beginner's guide to the Data Science profession in Nigeria; how to start a career in Data Science.

Data science cool things.

Who are Data Scientists?

Data Scientists are people with some mix of coding and statistical skills who work on making data useful in various ways. They currently sit at the top of the job market (google it). Nigeria and the world need more of them.

At the start of the pandemic, the Lagos State Governor noted that one of the major challenges the country faced in combating it was a lack of reliable data. Countries that combated the pandemic effectively did so with the help of reliable and accurate data that was collected, cleaned, analysed and used in decision making. That is what Data Scientists do.

This is not limited to government; businesses also run on data. Data is generated from your favourite Netflix and cable TV shows, the type of vehicle you choose on Uber, and the items you buy most often in your favourite store. This is called user-generated data, and e-commerce and consumer companies use it to create products and services that match customers' expectations and preferences. This is what Data Scientists do.

But Data Scientists go beyond this. Some build and manage large server farms for big companies like Google, Amazon and Facebook; some use predictive techniques such as regression and other models to build machine learning algorithms that power autonomous machines, robots, self-driving cars and drones; some build algorithms that help systems integrate in the Internet of Things; some build and manage databases; and some build and manage data on cloud technologies.

There are many possibilities in Data Science.

What types of Data Scientists do we have?

For the purpose of explaining this in very basic terms, I will split the profession into two: Type A Data Scientists and Type B Data Scientists.

Type A Data Scientists

The A here is for analysis. This type of Data Scientist is primarily concerned with making sense of data or working with it in a fairly static way: cleaning the data, dealing with large data sets, visualizing and reporting the data, applying deep knowledge of a particular domain, and so on. The guys here can code well enough to work with data (consider writing formulas in Excel to manipulate a data set), but they are not necessarily expert coders. Some cannot code at all, but they are still excellent analysts. They are Business Analysts, Data Analysts, Database Designers and Administrators, AWS/Azure Data experts, etc.

What tools and skills are used by Type A Data Scientists?

- Data Analysis:

This is the field concerned with collecting raw data and transforming it into something that can be used to make better business decisions. Data Analysts collect, process, analyse and present data to support those decisions. The skills and tools used at the different stages are listed below:

Skills: R, Python, Statistics

Tools: SAS, Jupyter, R Studio, MATLAB, Excel, RapidMiner, SQL
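
To make the collect, process and analyse stages a bit more concrete, here is a minimal sketch in Python using only the standard library. The sales figures, store names and function names are all made up for illustration; a real analyst would more likely use a library or one of the tools listed above.

```python
import csv
import io
import statistics

# Hypothetical raw data: daily sales with one missing and one malformed value.
RAW_CSV = """day,store,sales
Mon,Ikeja,120
Tue,Ikeja,
Wed,Ikeja,150
Mon,Lekki,90
Tue,Lekki,abc
Wed,Lekki,110
"""


def collect(text):
    """Collect: read raw CSV text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def clean(rows):
    """Process: keep only rows whose sales value is a whole number."""
    cleaned = []
    for row in rows:
        value = row["sales"].strip()
        if value.isdigit():
            cleaned.append(
                {"day": row["day"], "store": row["store"], "sales": int(value)}
            )
    return cleaned


def analyse(rows):
    """Analyse: compute the average sales per store."""
    by_store = {}
    for row in rows:
        by_store.setdefault(row["store"], []).append(row["sales"])
    return {store: statistics.mean(values) for store, values in by_store.items()}


rows = clean(collect(RAW_CSV))
summary = analyse(rows)

# Present: print one line per store.
for store, average in summary.items():
    print(f"{store}: average sales {average}")
```

The two bad rows are dropped during cleaning, so each store's average is computed from its two valid days only.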

- Data Visualization:

This is an extension of Data Analysis. It is the representation of information in the form of charts, diagrams, pictures, etc., to help decision making. When data is visualized properly, it is easily understood and grasped. Consider the following skills and tools used.

Skills: R, Python Libraries

Tools: Jupyter, Tableau, Cognos, RAW, Microsoft PowerBI, QlikView, Plotly
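
As a rough illustration of the idea, the sketch below turns a few numbers into a text bar chart with plain Python. The product names and figures are hypothetical, and `bar_chart` is just a helper written for this example; tools like Tableau, PowerBI or Plotly draw proper graphics, but the principle of mapping values to visual length is the same.

```python
# Hypothetical figures: units sold per product.
units_sold = {"Shoes": 42, "Bags": 18, "Watches": 9}


def bar_chart(data, width=30):
    """Return a text bar chart, scaling the largest value to `width` characters."""
    largest = max(data.values())
    label_width = max(len(label) for label in data)
    lines = []
    for label, value in data.items():
        bar = "#" * round(value / largest * width)
        lines.append(f"{label.ljust(label_width)} | {bar} {value}")
    return "\n".join(lines)


chart = bar_chart(units_sold)
print(chart)
```

Even in this crude form, it is easier to see at a glance which product sells the most than it is from the raw numbers.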

- Data Warehousing, Database & Big Data:

The guys here collect, build and manage different database systems, data types, data sources, etc., to help systems operate well and to support better business decisions. A data warehouse, for instance, is a form of database designed for query and analysis, and it contains heterogeneous data from multiple sources. Big data, on the other hand, refers to data that is voluminous, varied and collected rapidly. Think of the data generated every second in some form or other: from the transactions we make, from our images and videos, from our surveys, from credit or debit cards, and from sensors. All of this is collectively called big data.

Big data has no use on its own. It is usually applied to use cases such as reporting, data modelling, IoT, banking, etc.

Skills: ETL, SQL, Hadoop, Apache Spark

Tools: Informatica/Talend, AWS Redshift
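
Here is a small sketch of the warehousing idea: data from two separate sources is loaded into one table and then queried with SQL. It uses Python's built-in `sqlite3` module as a stand-in for a real warehouse such as AWS Redshift, and the table name, columns and records are all assumptions made up for the example.

```python
import sqlite3

# Hypothetical source data from two separate systems.
online_orders = [("2020-08-01", "shoes", 2), ("2020-08-02", "bags", 1)]
store_receipts = [("2020-08-01", "shoes", 5), ("2020-08-03", "bags", 4)]

# An in-memory database stands in for the warehouse.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (sale_date TEXT, product TEXT, quantity INTEGER, source TEXT)"
)

# Extract and load: tag each row with its source while loading into one table.
for source, records in (("online", online_orders), ("store", store_receipts)):
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?, ?)",
        [(sale_date, product, quantity, source) for sale_date, product, quantity in records],
    )

# Query: total quantity per product across both sources.
totals = conn.execute(
    "SELECT product, SUM(quantity) FROM sales GROUP BY product ORDER BY product"
).fetchall()
print(totals)
conn.close()
```

Once the data sits in one place, a single query can answer a question that neither source system could answer on its own.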

Type A is where a lot of people start, build their skills, and then move to Type B.

Type B Data Scientists

The B here stands for building. Type B Data Scientists share some statistical background with Type A, but they are also very strong coders and developers and may be trained software engineers. These guys are mainly interested in using data “in production”. They build models which interact with users, often serving recommendations (products, people you may know, ads, movies, search results); they also build algorithms that learn and adapt. They are Data Engineers, Machine Learning Engineers and Artificial Intelligence Specialists.

What tools and skills are used by Type B Data Scientists?

- Machine Learning:

This is a field that blends programming with science. It involves teaching computers to make decisions and complete actions based on data, without explicitly programming them to do so. Think of Siri, Amazon’s assistant Alexa, or Google Assistant. These applications help you complete actions using your voice input. Machine learning is an integral part of these applications, as they collect, refine and use data based on the user’s previous interactions with them. The same approach is seen in facial recognition, fingerprint scanning, etc. To kick-start your career here, you must acquaint yourself with the three basic building blocks: linear algebra, probability and statistics. Consider the following skills and tools.

Skills: Algebra, Machine Learning Algorithms, Statistics, Linear & Logistic Regression

Tools: Spark MLlib, Mahout, Azure ML Studio
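
To show what learning from data can look like, here is a minimal sketch of simple linear regression, one of the skills listed above, written in plain Python with the standard least-squares formulas. The hours and scores are hypothetical numbers, and `fit_line` and `predict` are names chosen for this example; in practice you would use a library or one of the tools above.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalised).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Use the fitted line to estimate y for a new x."""
    return slope * x + intercept


# Hypothetical training data: hours studied and test scores.
hours = [1, 2, 3, 4, 5]
scores = [52, 54, 61, 63, 70]

slope, intercept = fit_line(hours, scores)
estimate = predict(6, slope, intercept)
print(f"slope={slope}, intercept={intercept}, estimate for 6 hours={estimate}")
```

Nobody wrote a rule saying how many marks each extra hour is worth; the slope is learned from the examples, which is the core idea behind machine learning.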

- Data Engineering:

Data Engineers often work with other Data Scientists to build end-to-end solutions for companies using statistics and machine learning models. They build data architectures for ingesting, processing and surfacing data for large-scale, data-intensive applications. Think of the Google search engine algorithm, or the Instagram or Facebook algorithm; Data Engineers are at the centre of such complex and large-scale applications.

Skills: Deployment, ML Algorithms, Software Engineering, Advanced SQL, Data Modelling

Tools: AWS, Azure, Apache Spark, Kafka, Zookeeper, YARN, Airflow, etc.
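
The ingestion, processing and surfacing stages can be sketched as a tiny pipeline in Python. This is only a toy under stated assumptions: the events, field names and function names are hypothetical, and real pipelines would run on tools like Kafka, Spark or Airflow rather than a list in memory.

```python
import json
from collections import Counter

# Hypothetical raw events as they might arrive from an app; one line is malformed.
RAW_EVENTS = [
    '{"user": "ada", "action": "search"}',
    '{"user": "bola", "action": "click"}',
    "not valid json",
    '{"user": "ada", "action": "click"}',
]


def ingest(lines):
    """Ingestion: parse each raw line, skipping lines that are not valid JSON."""
    for line in lines:
        try:
            yield json.loads(line)
        except json.JSONDecodeError:
            continue


def process(events):
    """Processing: keep only the fields that downstream users need."""
    for event in events:
        if "user" in event and "action" in event:
            yield event["user"], event["action"]


def surface(pairs):
    """Surfacing: aggregate into counts per action, ready for a report."""
    return Counter(action for _user, action in pairs)


action_counts = surface(process(ingest(RAW_EVENTS)))
print(dict(action_counts))
```

Each stage hands its output to the next, so a bad record is dropped at ingestion and never reaches the final counts.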

Many Data Scientists are a mix of Type A and Type B. This distinction is mainly to help you understand the different categories and fields.

Fulfilling the prerequisite

Education:

If you can get an education in Data Science, whether an undergraduate or postgraduate degree or even a diploma, then you should; it will go a long way in helping you kick-start your career. Nigerian universities are still far behind in this field, so your best bet is a programme from a university abroad. Many reputable programmes can be taken entirely online and are relatively affordable. The modules covered and the structured learning will help you a lot. But if this is not for you, then check the next option.

Online Learning:

There are now online platforms that offer training in data science, and DataCamp is one of them. They have different options, so check out these platforms, compare the programs and prices, and decide which to take. Most of the programs run for 3 months and have a jobs option attached, that is, they connect you with employers at the end of your training, like Cybrary and Lambda. Check out these platforms:

  • Dataquest: Online data science learning platform that offers courses in different areas like Data Analyst in R, Data Analyst in Python, Data Scientist in Python, and Data Engineer. This is a paid platform, not a free one.

  • DataCamp - Learn Data Science Online: The skills people and businesses need to succeed are changing. No matter where you are in your career or what field you work in, you will need to understand the language of data. With DataCamp, you learn data science today and apply it tomorrow.

  • Udemy and Coursera – They both offer many data science courses. You could also check them out.

Want a free introductory course in data science?

Check

Self-Learning:

This is the alternative to getting formal education in the field. With the right dedication, resources and consistency, you can learn the skills needed to get your career off the ground. This is how you can do it:

  • Know the field of data science you want to enter; it’s also okay if you don’t have a specific field in mind yet. Then review the modules of courses offered by different schools or platforms in that field. Take data analysis, for instance: visit the websites of schools or online training providers and look at the modules they offer. This will help you understand the knowledge areas you need to cover, like Multivariable Calculus, Linear Algebra, Python, R, Probability, Statistics, etc.

  • The next step is to gather the resources you need for your self-learning. Collecting these and following through is usually a challenge, and a lot of people give up; self-learning requires a lot of drive and is not for everyone. There is an abundance of free online resources for each module or subject area you listed above, but you will hardly find a single free course that completely covers all the subject areas of a specific Data Science field. Therefore, you should collect them individually, for example one for Python, one for Excel, one for Statistics, and so on.

  • Build projects or products. This is the best way to learn by yourself. You can build a website, analyse datasets, create cool visualizations, anything at all. Just keep building as you learn. Check Kaggle and KDnuggets for projects.

Plug yourself into the community.

Attend an interesting talk, learn about data science live, and meet data scientists and other aspiring data scientists. Connect with data scientists on LinkedIn and network with them. I am going to compile a list of popular blogs and communities for data science, so be sure to check back under the data section here to get that.

Jobs.

Data Science jobs are becoming increasingly common in Nigeria, and this is a welcome development. Most of these are entry or mid-level roles, with 1 to 3 years’ experience required. Internships are the easiest way for a beginner to get into the industry, and they usually complement your learning. You will not usually see internship positions advertised, so you will need to reach out to companies you know have structured data departments; make sure you update your LinkedIn too. Set up Google alerts and LinkedIn alerts for jobs, and sign up to popular blogs to get notified when jobs in these areas open.

Look out for more on skills, online courses and choosing a specialization in Data Science under the Data section of the site.
