Data analysis plays an important role in the daily lives of people working at any financial institution. The first boom in the use of data analysis in finance came in the 1980s: by using data and computers to support decision-making, companies doubled their revenues. This led to an exponential increase in the number of data analysts, following the basic economic rule that demand creates supply.
This article is a small dig into the bottomless pit of data analysis. First, let's describe the project and break the idea down into smaller tasks. The foundation of this project is that I always struggled to find a recipe that used the products I had available at home. I could either give up and let the system win, going out to buy the missing ingredients, or create an easy-to-use system that uses AI to generate recipes from the ingredients I have at home right now. Some of you might have reasonable concerns and argue, "I have seen a model like that ages ago." You are right. However, the idea of this project was not only that model but also gathering recipes from the internet and putting them into an easily accessible system, and that's where data analysis comes in.
Now that we know our goal, we need to figure out how to achieve it. A good approach is to break it down into smaller tasks:
Decide on the system and name of this project.
Create parsers (programs used to scrape data from websites).
Create the code for this system, allowing users to communicate with it.
When I tackled this problem, I decided that, out of the various options, a Telegram bot would be the best platform for this system. So the first task was successfully accomplished. Here's the link to the bot: https://t.me/Food_to_cook_bot.
Now it's time for coding. Coding might be new for some of you, but there is nothing difficult about it, and anyone can figure out how to write simple code. First, I needed to create a database with all the recipes in it. I will not go deep into the coding process, but in simple terms, I used the Selenium package to open each link and capture the web page with the necessary information embedded in it. Then I used BeautifulSoup to scrape that information out and put it into a dictionary.
Here's what some of the code looks like:
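In spirit, the scraping step boils down to something like the sketch below. The example URL and the CSS selectors are placeholders, not the ones used in the project; the real selectors depend on the markup of the site being scraped.

```python
from selenium import webdriver
from bs4 import BeautifulSoup

driver = webdriver.Chrome()  # assumes a local Chrome/ChromeDriver setup
recipes = {}

# Hypothetical list of recipe pages; the real project used links gathered elsewhere.
urls = ["https://example-recipe-site.com/recipe/1"]

for url in urls:
    driver.get(url)                                           # open the page in the browser
    soup = BeautifulSoup(driver.page_source, "html.parser")   # parse the rendered HTML

    # Placeholder selectors: pull the title and the list of ingredients.
    title = soup.find("h1").get_text(strip=True)
    ingredients = [li.get_text(strip=True) for li in soup.select("li.ingredient")]

    recipes[title] = ingredients

driver.quit()
```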
When all the data is scraped, we need to clean it. Cleaning the data means making sure every record meets certain criteria before it is used later on. In my case, the data was scraped quite accurately, but in some places stray symbols had crept in, so I had to write a few more lines of code to bring all of my data into the same format. Then I wanted to pick out the three main ingredients in each recipe, and luckily I found a model on Hugging Face that allowed me to do exactly that.
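The exact cleaning rules depend on what the scraper leaves behind; as an illustration, a normalization pass along these lines could be enough (the regular expressions here are an assumption, not the project's actual rules). The follow-up step of ranking ingredients with the Hugging Face model is not sketched, since the article does not name the model.

```python
import re

def clean_ingredient(raw: str) -> str:
    """Bring one scraped ingredient string into a single common format."""
    text = raw.lower().strip()
    text = re.sub(r"[^\w\s,./-]", "", text)  # drop stray symbols left over from scraping
    text = re.sub(r"\s+", " ", text)         # collapse repeated whitespace
    return text

# 'recipes' is the dictionary built at the scraping stage.
cleaned = {title: [clean_ingredient(item) for item in items]
           for title, items in recipes.items()}
```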
When all the data is collected, we need to write the actual code that lets the user communicate with the bot and get all the information they need from it. This took me quite a while, but the result was worth it. As mentioned before, the main goal of this article is not to show the code but to describe what an average data analyst has to do, and this project allowed me to do exactly that.
Here are the lines of code that took me the longest to figure out. They wait for the user to press the button with the name of a dish; when they do, the bot sends back a card with all the necessary information.
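A minimal sketch of that button flow might look like the following. The article does not say which Telegram library the bot uses, so this assumes python-telegram-bot (v20+); the RECIPES dictionary, the command name, and the card text are placeholders.

```python
from telegram import InlineKeyboardButton, InlineKeyboardMarkup, Update
from telegram.ext import Application, CallbackQueryHandler, CommandHandler, ContextTypes

# Hypothetical in-memory "database" built from the cleaned recipe data.
RECIPES = {"Dish 1": "ingredient A, ingredient B, ingredient C",
           "Dish 2": "ingredient D, ingredient E, ingredient F"}

async def list_dishes(update: Update, context: ContextTypes.DEFAULT_TYPE) -> None:
    # One button per dish; the dish name is passed back as callback data.
    keyboard = [[InlineKeyboardButton(name, callback_data=name)] for name in RECIPES]
    await update.message.reply_text("Pick a dish:",
                                    reply_markup=InlineKeyboardMarkup(keyboard))

async def send_recipe_card(update: Update, context: ContextTypes.DEFAULT_TYPE) -> None:
    # Fires when the user presses one of the dish buttons.
    query = update.callback_query
    await query.answer()
    await query.edit_message_text(f"{query.data}\n\n{RECIPES[query.data]}")

app = Application.builder().token("YOUR_BOT_TOKEN").build()
app.add_handler(CommandHandler("dishes", list_dishes))
app.add_handler(CallbackQueryHandler(send_recipe_card))
app.run_polling()
```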
Thank you for your time. If you have any questions about the coding part of this project or about the theory behind it (like the Hugging Face model), feel free to leave a comment.