Scrape Data From Facebook

How To Scrape Data From Facebook For Thousands Of Leads For Free. No Code.

Hello guys, I am Arpit. Today we will learn how to scrape data from Facebook for thousands of leads, without any code and for free.

We will scrape business information like emails, phone numbers, websites, and addresses from Facebook.

We will do it using a tool called Octoparse.

This is probably the easiest and most efficient tool to scrape data from Facebook.

I run a course, AI for Marketing, in which I cover how to leverage the power of artificial intelligence for marketing. I also cover hacks like these. To know more, attend the free webinar; the link is in the description.

Facebook has a lot of data about businesses.

Visit this link to see all the business categories on Facebook.

You can access the links and the script for this video from my blog; the link is in the description.

This page has hundreds and hundreds of business categories.

This is a gold mine. I am sure you can find your target audience here.

They have just about any niche; there are so many I can't even read them all.

So once you know the category you want to scrape, just click on it.

For example, I will click on Social Media Agency; they are a good target for my courses.

Now, to scrape information from these pages, we will build two tasks in Octoparse.

The first task will scrape the links of these pages.

When I was testing, I scraped more than 2,000 links, but I am sure Facebook has way more than that.

The second task in Octoparse will scrape the About pages of these businesses; that is where the contact information is displayed.

Alright let’s begin.

First, you need to download the latest version of Octoparse; the link for it is in my blog, which is in the description.

Once you have created your account and logged in, hover on New, then click on Advanced Mode.

Then, under the website section, put www.facebook.com and hit Save.

Now we need to train Octoparse to log in to our Facebook account.

Click on the email field on Facebook, then click on Enter Text, enter your email ID, and hit Confirm.

Then click on the password field, then click on Enter Text, enter your password, and hit Confirm.

Then click on the login button and select Click Button.

You will see it logging in on your behalf.

Then, under the workflow section on the left, hover your cursor over the downward arrow and you will see a plus sign appear.

Click on it – then click on the downward arrow to open up more options.

Select open page.

Then get into the action settings of go to web page.

Head on to Facebook to copy the link of the category page that you want to scrape.

Come back to Octoparse and paste it in the URL field. Click on ok.

Then click on auto detect web page to scrape.

Once completed, you will see it has detected some data we could scrape, but we don't want all of it; we just need the one column that has the URLs for all these pages.

So I will delete the other columns.

Under the Tips widget, you will see it has already selected Paginate to scrape more pages.

If you click on Check, you will see it has detected the Next button, which means it will scrape the data from all pages on its own.

It has also added a page scroll. We just need to click on create workflow and it is done for us.

Then from the top click on Run and then run task on your device.

Then another window will pop up and you will see it working.

Isn't this really cool and simple to do? Let me know in the comment section.

Then you can collect as many page links as you want.

As you can see here, I have extracted more than 3,000 pages.

Then click on Stop Run, then Export Data, then select Excel and click OK.

Name your file. Then click on open file.

We don't need the column heading; you can delete it.

Let me quickly scroll down to show you these 3,000 links.

Now we need to add "about" at the end of these URLs.

The About page is where you have all the contact information.

We need to use an Excel formula, so go to the next column and enter the formula (it is given in the blog as well): =A1&"about"

A1 is the cell holding the URL, and "about" is what we want to append at the end.

Once the sheet is ready, save it.
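If you'd rather skip the spreadsheet step, the same URL tweak can be done with a few lines of Python. This is just an optional sketch; the example URLs below are hypothetical, and it assumes the scraped links point at the page root (with or without a trailing slash).

```python
# Append "/about" to each scraped Facebook page URL.
# Alternative to the Excel formula step; the sample URLs are made up.

def about_url(url: str) -> str:
    """Turn a page URL into its About-page URL, handling trailing slashes."""
    return url.rstrip("/") + "/about"

urls = [
    "https://www.facebook.com/examplepage/",  # hypothetical page links
    "https://www.facebook.com/anotherpage",
]
about_links = [about_url(u) for u in urls]
for link in about_links:
    print(link)
```

You could read the exported Excel/CSV column into the `urls` list and write the result back out, then feed that file to the second Octoparse task exactly as described below.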

Head on to Octoparse.

We will create a task.

Hover on New. Select Advanced Mode.

From Input URLs, click on Import from File.

Click on select.

Choose the file click on open.

Select sheet 1.

Select column as B1.

Click on save.

Then you will see that a Go to Web Page step has been added inside a loop.

We need to configure it so click on the action settings.

Then inside after loading the page check scroll down the page after it is loaded.

Then select scroll to the bottom of the page.

Then from the wait time dropdown select 1 second. Click on ok.

We need to make Octoparse log in first.

So we will add a step above this loop.

Click on the + icon on the downward arrow.

Expand the widget and choose open page.

Then get into its settings.

Under URL field type in www.facebook.com and hit ok.

Make it log in once again.

Click on the email field, then click on Enter Text, enter your email ID, and hit Confirm.

Drag the step to the top.

Then click on the password field, then click on Enter Text, enter your password, and hit Confirm.

Again drag the step.

Then click on the Log In button, select Click Button, and drag the step.

Then, inside the Loop URLs box, click on Go to Web Page.

Now we need to train the bot on what to scrape.

So first we need the name of the page.

So click on it and select extract the text of the selected elements.

Then click on the website and again select extract the text of the selected elements.

Do the same with their email.

We can also scrape address and phone numbers.

Since this page has not shared those, we will train Octoparse on some more pages.

If you hover over the Loop URLs step, you will see a list symbol. Click on it.

Select the 2nd Facebook page and then click on go to web page step.

Now, at the bottom, you can see it has learnt how to scrape the page name, but we need to train it to grab the other details as well.

Click on their website and then select extract the text of the selected elements.

Do it with the phone number and the email id as well.

Now we will go on to the next page.

So from the list select the 3rd URL and click on go to web page.

So here we can see it has collected all the information.

Ok so we will go on the next page.

We still need to train it on scraping the address so we need to find an example.

I actually trained it on about 15 pages; the more you do, the better it is.

I trained it on how to scrape addresses.

You need to show it enough examples so it picks up on it.

Facebook will actually change the element names, so we cannot really use the XPath method.

I can vouch for the method I am using; I have played a lot with the tool, and it will collect a lot of data for you.

I am going to fast-forward the video until I have trained it on about 15 pages.

Now that I am done, I will save the task and then hit Run.

Then click on run task on your device.

There will be a new window where you can see it working.

At the bottom, you will be able to see the number of data lines extracted.

On each page it is scraping 10 Facebook pages.

So you can stop when you think you have enough data.

Again I will fast forward the video.

Now you can see how many pages it has scraped in this much time.

I will click on Stop Run, confirm with Yes, and click on Export Data.

Select Excel and click ok.

Name the file and then click on open file.

Here is the Excel sheet with all the scraped data.

Have a look.

To count the number of emails scraped, press Ctrl+F, put in @, and click Find All.

You should validate these emails before you do an email outreach campaign.

I am sharing the link to a very nice email validator for you.
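If you want a quick pre-filter before running a proper validation service, a few lines of Python can count and sanity-check the scraped addresses. This is a minimal sketch; the sample list is hypothetical, and a regex check only weeds out obvious junk, not dead mailboxes.

```python
import re

# Rough syntax check for scraped email addresses (sample data is made up).
# A real validator should still be used before any outreach campaign.
EMAIL_RE = re.compile(r"^[\w.+-]+@[\w-]+\.[\w.-]+$")

scraped = ["hello@agency.com", "not-an-email", "team@social.co", ""]
emails = [e for e in scraped if EMAIL_RE.match(e)]
print(len(emails), "syntactically valid emails")
```

Paste your email column into the `scraped` list (or read it from the exported file) and you get a deduplication-ready shortlist, plus the count you would otherwise get from Ctrl+F.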

You will also need to organize this data. For that, I use a free tool called Dataprep by Google.
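As a rough illustration of what that cleanup involves, here is a minimal Python sketch that drops duplicate rows and rows without an email. It assumes you have saved the export as CSV; the column names and sample data are made up.

```python
import csv
import io

# Minimal cleanup sketch: remove exact duplicates and rows missing an email.
# In practice you would open your exported CSV file instead of this sample.
raw = io.StringIO(
    "name,email\n"
    "Agency One,hello@one.com\n"
    "Agency One,hello@one.com\n"   # duplicate row
    "Agency Two,\n"                # no email scraped
)

seen = set()
rows = []
for row in csv.DictReader(raw):
    key = (row["name"], row["email"])
    if row["email"] and key not in seen:
        seen.add(key)
        rows.append(row)

print(rows)
```

Dataprep does this (and much more) through a visual interface, so treat this only as a peek at what the tool is doing for you under the hood.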

I cover this tool in my AI for Marketing course; feel free to be a part of it. To know more, attend the free webinar linked in the description.

