Where is your company based?
We are headquartered in Dubai, with our service operations in Nepal and the USA.
How does the data subscription work and how is it priced?
Customers with recurring data needs are priced monthly in arrears. There is an initial one-time setup fee. Customers are either billed a flat monthly fee or based on metered usage. The latter is reserved for high-volume projects. Other billable fees for consulting and technical support are agreed in advance before they’re added to your invoice.
Do you have a referral program?
Yes, we do have a Referral Partner Program where our partners are rewarded handsomely for providing us qualified leads.
For more information about this and our other partnership models, please visit our partnership page.
Can I get the raw HTML along with structured data?
Certainly! We can pull the underlying HTML along with structured data. We can also have the HTML output automatically deposited in your cloud storage platform.
How does Grepsr ensure quality data?
We’ve built several quality controls, both platform-based and human-in-the-loop, to meet quality standards.
Platform-based controls
- Notification triggers in the crawler that execute at run time to identify chokes and failures during crawler execution. System monitors to catch system-wide errors
- Data schema definitions to set acceptable formats. Anomaly detection using historical data
- Quality and operational dashboards to monitor project health. Custom reporting for key accounts to analyze key metrics
Quality experts
- Validate initial setup with customer consultation to ensure quality compliance
- Manually QA a randomized sample set per SLA terms
- Proactive communication and resolution (within 24 hours, unless the source undergoes wholesale changes)
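As a rough illustration of the schema and anomaly checks listed above, here is a minimal Python sketch. It is not Grepsr's implementation: the `SCHEMA` fields, the function names, and the three-standard-deviation threshold are all assumptions chosen for the example.

```python
from statistics import mean, stdev

# Hypothetical schema: field name -> expected Python type.
SCHEMA = {"title": str, "price": float, "url": str}


def schema_errors(record, schema=SCHEMA):
    """Return a list of problems found in one record (empty if it conforms)."""
    errors = []
    for field, expected in schema.items():
        if field not in record:
            errors.append(f"missing field: {field}")
        elif not isinstance(record[field], expected):
            errors.append(f"bad type for {field}: {type(record[field]).__name__}")
    return errors


def is_anomalous(current_count, history, threshold=3.0):
    """Flag a run whose row count deviates from the historical mean
    by more than `threshold` sample standard deviations."""
    if len(history) < 2:
        return False  # not enough history to judge
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return current_count != mu
    return abs(current_count - mu) / sigma > threshold
```

A run that returns 100 rows when previous runs returned about 1,000 would be flagged by `is_anomalous`, while a record missing its `url` field would be reported by `schema_errors`.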
How long has your company been operating, and do you have a large client base?
We’ve been in business since 2012 and have worked with many Fortune 500 companies. Most of our clients are in the US, with the rest spread around the world. We have grown continuously each year and have a strong team of over 90.
Which industries do you serve with your services?
We provide data to companies in many industries. We are industry agnostic. However, most of our customers are from global consulting firms, eCommerce, real estate, travel and hospitality, investment banks and healthcare sectors.
Do you collect all the data in-house, or do you work with third parties?
All data is collected in-house using our proprietary engine and infrastructure.
Can we see a proof of concept before we commit to a payment plan?
To pull data, we need to set up crawlers no differently than we would in a full-fledged project. Because of the time and effort this entails, we only take on a project once payment is received.
That said, for every project, we provide a sample dataset before moving on to full production. This ensures data is per scope and quality criteria are met. If you’re not satisfied with the sample, then we are happy to make modifications or even offer a full refund.
Can I schedule crawlers to automate data collection? Or run them manually when needed?
Absolutely! You can run crawlers manually on an ad hoc basis or create recurring schedules to automate your crawl runs. Scheduled runs work like clockwork, simplifying your data acquisition workflow.
Read more about scheduling crawlers in our platform documentation here.
How will I receive my data once it’s scraped?
For large scale data collection, we automatically deliver the output to your preferred cloud storage location. We support Amazon S3, Google Cloud, Azure Cloud, Dropbox, Box, FTP and more. You must authorize the respective filesystem before we can store the output.
Output can also be manually exported from the platform. Learn more about how you can integrate with Grepsr in our platform documentation here.
Can I schedule the data collection to run automatically, or do I need to do it manually each time?
Absolutely! You can run crawlers manually on an ad hoc basis or create recurring schedules according to your requirements. Scheduled runs work like clockwork, simplifying your data acquisition workflow.
In what file formats will I receive the data?
We support standard formats such as CSV, XLSX, JSON, XML, and YAML. Contact us if you need a custom format not supported out of the box.
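To give a sense of what the same records look like in two of these formats, here is a small Python sketch using only the standard library. The function names and the sample record are hypothetical and only illustrate the formats; this is not how the platform produces its exports.

```python
import csv
import io
import json


def to_csv(records):
    """Serialize a list of dicts to CSV text, using the first record's keys as the header."""
    if not records:
        return ""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(records[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


def to_json(records):
    """Serialize the same records to indented JSON text."""
    return json.dumps(records, indent=2)


# Hypothetical sample record.
sample = [{"name": "Widget", "price": 9.99}]
csv_text = to_csv(sample)
json_text = to_json(sample)
```

CSV suits flat, tabular data that opens directly in a spreadsheet, while JSON preserves types and nested fields.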
Can I add my colleagues to work on data collection projects with me?
Yes! Grepsr’s data management platform makes it easy for remote teams to collaborate on their data projects. You can also manage the access levels of your colleagues, so you always have control over who has visibility and into what.
Can the system run the same data collection task across multiple web pages?
It depends on the individual sites. Even when sites look similar in structure, some may require a different setup because they use a different system internally. We can only confirm once we have the list.
Can the platform export data to formats like Excel?
Yes, we can extract text and tabular data and export it as Excel or any other required file format (CSV, JSON, XML).
Can the system extract data from documents like PDFs or Word files?
Yes, the system can handle PDFs and DOCs. However, if the PDFs or DOCs do not have a similar structure across the board, this can become more challenging. The layout or structure also affects how PDFs are extracted.
Is web scraping legal?
Scraping publicly available data is perfectly legal so long as 1) it does not violate the source site’s terms of service, 2) the data is not copyrighted, and 3) the data does not contain Personally Identifiable Information (PII). It is fair to say that this is a contested and misunderstood topic. You can read more about the legalities of web scraping in our blog here.
How do you ensure your data collection practices are compliant with laws and regulations?
Our governance and compliance function is structured to align with ISO 27001 standards, ensuring robust information security management across all aspects of our operations. We maintain and implement comprehensive policies for our Information Security Management System (ISMS), covering areas such as data collection, storage, and distribution. These policies include strict protocols for secure storage, access control, regular audits, and compliance with relevant legal and regulatory requirements. By adhering to ISO 27001 standards and maintaining a rigorous ISMS framework, we uphold the highest standards of data security and privacy throughout our operations.
How do you confirm that you are allowed to collect and share the data you provide?
The data is collected exclusively from public sources. These sources include publicly accessible websites where the data is freely available to the public without any access restrictions or requirements for special permissions. We do not claim ownership of the dataset, as it is sourced from publicly available information. Our role is limited to aggregating and organizing the data for easier access and analysis.
How does your subscription model work, and how is pricing structured?
Customers with recurring data needs are priced monthly in arrears. There is an initial one-time setup fee. Customers are either billed a flat monthly fee or based on metered usage. The latter is reserved for high-volume projects. Other billable consulting and technical support fees are agreed in advance before they’re added to your invoice.
What makes your service stand out from competitors?
USP
Managed Service & Platform Based: Fully managed plug-and-play service. Customers need not worry about setting up crawlers or working around technical hurdles.
Fast & Reliable Services: Efficient delivery SOP with a quick turnaround time. Highly responsive support team to monitor and manage your data pipelines.
Advanced Quality Monitoring: Scalable quality control processes using both technology & dedicated reviewers to ensure the highest data quality consistently.
Flexible Pricing: A pricing model that works for you. Tailored pricing depends on your data requirements, volume, and use case complexity, so you pay based on your needs.
What types of data collection services do you offer?
We provide bespoke data solutions, i.e., the customer (you) can define any data structure you want pulled from any target sites. Our experienced data acquisition team will set up the scrapes for you and deliver the data in the required format. In addition to the data we collect, we will also help you QA, dedupe, and normalize it when needed.
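For readers curious what normalizing and deduplicating can involve, here is a minimal Python sketch. The helper names, the whitespace-only normalization, and the choice of `sku` as the deduplication key are assumptions for illustration; actual rules are defined per project.

```python
def normalize(record):
    """Collapse runs of whitespace and trim the ends of every string field."""
    return {
        key: " ".join(value.split()) if isinstance(value, str) else value
        for key, value in record.items()
    }


def dedupe(records, key_fields):
    """Keep the first record seen for each combination of key_fields."""
    seen = set()
    unique = []
    for record in records:
        key = tuple(record.get(field) for field in key_fields)
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


# Hypothetical rows: the same product scraped twice with inconsistent spacing.
rows = [
    {"name": "  Blue   Widget ", "sku": "A1"},
    {"name": "Blue Widget", "sku": "A1"},
]
cleaned = dedupe([normalize(r) for r in rows], key_fields=["sku"])
```

Normalizing before deduplicating matters here: records that differ only in stray whitespace become identical and are easier to match.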
What kind of ongoing support do you provide after a project starts?
We assign each customer a dedicated Customer Success team member. They will be your main point of contact, liaising with you during setup and providing ongoing support. As long as you’re a recurring customer, we will adjust the scrapes for any changes on the site for free.
How do you ensure long-term success and partnership with clients?
We believe in growing with our partners: we typically start with an initial set of requirements and expand as our clients’ needs grow. A successful partnership depends on setting clear expectations for timing, delivery, and cost, and on continuing to offer transparent pricing and quality data.
Do you offer live technical support? If so, when is it available?
Turnaround depends on the complexity of the fix: minor fixes take one business day, and major fixes take two to three business days. We also have a support and collaboration channel on our platform that you can use if there are any problems with the data.
What methods do you use to collect data from websites?
Grepsr employs web scraping to extract data from websites. We provide Data-as-a-Service, wherein we will set up the bots/scrapers for you and run them on our cloud infrastructure. You can access the Grepsr platform to send requirements, view data, schedule extractions, set up data delivery, and communicate with our team. This platform is hosted on the cloud and accessed through the web.
Do you outsource any part of the data collection process?
No, we do not outsource data collection and distribution of the dataset. We have a dedicated team of engineers who write the specific crawler meeting your requirements, and these crawlers run on our cloud-based platform.
How much data can I collect using your service?
There is no limit to how much data you can collect. Data projects are priced based on scale and complexity.
How long does it take to collect data once I’ve provided the requirements?
The exact lead time depends strictly on the data requirements, such as the number of sources and their complexity. Our customers value us for quick turnaround; on average, a typical project is completed in days, not weeks.
We set clear timeline expectations beforehand and aim to get the initial sample ready within a couple of days.
Can you scrape images and other files from websites?
Yes! Our web crawlers can scrape images in the form of either URLs or files. Scraping as files requires extra effort and, as a result, will incur an additional charge. The image files will be zipped and emailed/synced with the rest of your data.
Do you still have a question?
You can always contact us. We'll try to get back to you as soon as possible!