Web Scraping Tutorial for Grepsr Browser Extensions
Written by Asmit Joshi on June 13, 2018
We designed Grepsr Browser Extensions to make data extraction simple for all of our customers, whether they’re technically inclined or not.
This tutorial shows how to use our browser extension to extract data from an online shopping platform (we’ve selected Best Buy).
Web Scraping Tutorial for Grepsr’s Browser Extension
- In your web browser (Google Chrome and Microsoft Edge are supported for now), go to the web page you want to extract data from.
- Click the Grepsr icon next to the address bar (the blue ‘g’) to activate the extension.
- Once the extension has loaded, navigate to the data field you want to extract and click it.
- If you’d like to extract similar fields from the same page, click one of those items and the rest should be tagged automatically.
- Sometimes unrelated fields also get tagged. To remove them, untag one of those fields; the rest of the unwanted tags should be removed as well, leaving only the ones you need.
- When you’re certain that you’ve got the desired field (also take note of the count that’s shown), click Save Selection.
- Enter the field name, and click Save or press Enter.
- Repeat steps 3 to 7 (from clicking a data field through saving its name) for other fields on the page.
- When you’re done, click Next to move on to the next phase.
- On the pagination prompt, select the type of pagination on the page.
- If there’s a “Next” or a “Load More” button, select that option, then navigate to the button and click it to tag it.
- Verify your fields on the next screen: edit field names, drag and drop items on the list to re-order them, or click the trash icon to delete them. Continue.
- If you want to tag items and extract more fields from each item’s details page, select the first option and Continue.
- The details page will now load with the browser plugin automatically activated.
- Tag, name and save the fields on the page.
- When you’re done, click Export.
- Verify your tagged fields on both the main and detail pages once again. Continue.
- Next, specify whether you need to be logged in to access the data on the website.
- If so, you’ll need to supply the login credentials: enter the username, password and login URL. To get the URL easily, click the icon to open the website in incognito/private mode, then copy and paste the login page URL from there.
- The next screen will show you the sample data as CSV and JSON.
- You’ll then be prompted to sign up or log in, in case you haven’t already.
- To sign up, either fill out the form, or simply use the Google Single Sign-On option.
- Once you’re logged in, create a new project or choose an existing one.
- Do the same with the report. The report is where your current dataset will appear.
- Finally, start crawling!
- You’ll be redirected to the Grepsr app platform, where you can watch the data start to populate the table.
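If you’re curious why clicking one item in the steps above can tag the rest, the general idea can be sketched in a few lines of Python. This is only an illustration and not how the extension itself is implemented: it assumes that similar items share a tag name and a class, and the markup, the class names (`sku-title`, `price`) and the helper `find_similar` are all hypothetical.

```python
from html.parser import HTMLParser


class SimilarFieldCollector(HTMLParser):
    """Collect the text of every element sharing a tag name and class."""

    def __init__(self, tag, css_class):
        super().__init__()
        self.tag = tag
        self.css_class = css_class
        self.values = []
        self._depth = 0      # > 0 while we are inside a matching element
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if self._depth == 0 and tag == self.tag and self.css_class in classes:
            self._depth = 1
            self._buffer = []
        elif self._depth > 0 and tag == self.tag:
            self._depth += 1  # nested element with the same tag name

    def handle_endtag(self, tag):
        if self._depth > 0 and tag == self.tag:
            self._depth -= 1
            if self._depth == 0:
                self.values.append("".join(self._buffer).strip())

    def handle_data(self, data):
        if self._depth > 0:
            self._buffer.append(data)


def find_similar(html, tag, css_class):
    """Return the text of all elements that look like the clicked one."""
    collector = SimilarFieldCollector(tag, css_class)
    collector.feed(html)
    collector.close()
    return collector.values


# Hypothetical listing markup; real pages will differ.
SAMPLE_HTML = """
<ul>
  <li><h4 class="sku-title">Laptop A</h4><span class="price">$499.99</span></li>
  <li><h4 class="sku-title">Laptop B</h4><span class="price">$649.00</span></li>
  <li><h4 class="promo">Sponsored</h4></li>
</ul>
"""

titles = find_similar(SAMPLE_HTML, "h4", "sku-title")
prices = find_similar(SAMPLE_HTML, "span", "price")
print(titles)
print(len(prices))
```

In this sketch the "Sponsored" heading is left out because its class differs, which mirrors why it’s worth checking the count before you save a selection.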
When the extraction is complete, you’ll receive an email with a download link to your data as CSV.
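Once you have the CSV, you may want to work with it in code. Here is a minimal sketch using only Python’s standard library, assuming a file whose first row holds the field names you entered while tagging. The column names and values below (`Product Name`, `Price`) are made up for illustration, and `csv_to_records` is our own hypothetical helper, not part of any Grepsr tool.

```python
import csv
import io
import json


def csv_to_records(csv_text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(csv_text)))


# Stand-in for the contents of a downloaded export (hypothetical data).
sample_csv = "Product Name,Price\nLaptop A,$499.99\nLaptop B,$649.00\n"

records = csv_to_records(sample_csv)

# The same rows, re-serialized as JSON.
as_json = json.dumps(records, indent=2)
print(as_json)
```

For a real file, you could read its contents with `open(path, newline="", encoding="utf-8")` and pass the text to the same function.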
Make the most of our app’s handy features as well, such as Schedules to automate extraction, and Data Delivery to seamlessly integrate your exports with popular storage destinations you’re already using, like Dropbox, Google Drive, Amazon S3, FTP and more.
We hope this web scraping tutorial proves to be a useful guide so that your Grepsr experience is fun, intuitive, and at the same time productive for your personal and business needs.