Data acquisition process for an online real estate marketplace


The client was a North American online marketplace for real estate investing that provided post-purchase market intelligence, portfolio analysis, and management oversight for investors.


The client wanted to give its users better, data-driven pre-purchase tools for assessing investment opportunities. The required data came from a mix of public and private sources, including free and paid databases, websites, and newsfeeds. The client lacked the in-house expertise to solve the following challenges:
  • Gathering data from highly heterogeneous sources
  • Handling very large data sets

SG Analytics supported the client by automating the data acquisition process and creating an end-to-end data standardization and management solution.


SG Analytics’ data scientists developed a data ingestion tool to collect structured and unstructured data of all types and standardize it across similar sources. The data ingestion tool used the following technologies:
  • APIs to collect feature data from relevant public and commercial databases.
  • Crawlers to index and screen related websites and scrape the most relevant information.
  • AI tools to pre-read, structure and classify newsfeed data from various sources.
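
The standardization step described above can be illustrated with a minimal Python sketch. It is not SG Analytics' implementation: the source names (`public_api`, `crawler`), the field names, and the `Listing` schema are all hypothetical, chosen only to show how records from differently shaped sources might be mapped onto one common format.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict

SQM_TO_SQFT = 10.7639  # square metres to square feet


@dataclass(frozen=True)
class Listing:
    """Hypothetical common schema shared by all sources."""
    source: str
    address: str
    price_usd: float
    area_sqft: float


def from_public_api(rec: Dict[str, Any]) -> Listing:
    # Assumed shape of a record returned by a database API.
    return Listing("public_api", rec["address"].strip(),
                   float(rec["price"]), float(rec["sqft"]))


def from_crawler(rec: Dict[str, Any]) -> Listing:
    # Assumed shape of a scraped record: price as display text, area in sqm.
    price = float(rec["asking_price"].replace("$", "").replace(",", ""))
    area = float(rec["area_sqm"]) * SQM_TO_SQFT
    return Listing("crawler", rec["addr"].strip(), price, area)


# One adapter per source type; adding a new source means adding one entry.
ADAPTERS: Dict[str, Callable[[Dict[str, Any]], Listing]] = {
    "public_api": from_public_api,
    "crawler": from_crawler,
}


def standardize(source: str, rec: Dict[str, Any]) -> Listing:
    """Convert a raw record from a known source into the common schema."""
    return ADAPTERS[source](rec)
```

Keeping one small adapter per source is one possible design; it confines source-specific quirks such as currency formatting or units to a single function, so downstream processing only ever sees `Listing` objects.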

SG Analytics applied MapReduce techniques to process the collected data at scale and prepare it for further processing.
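
As a rough illustration of the MapReduce pattern, the following single-process Python sketch runs the three phases (map, shuffle, reduce) over a handful of records. The job itself, an average listing price per zip code, and the field names `zip_code` and `price_usd` are assumptions made for the example; the case study does not say which aggregations were computed or which framework was used.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_listing(listing: dict) -> Iterator[Tuple[str, float]]:
    """Map phase: emit one (key, value) pair per input record."""
    yield listing["zip_code"], listing["price_usd"]


def shuffle(pairs: Iterable[Tuple[str, float]]) -> Dict[str, List[float]]:
    """Shuffle phase: group all emitted values by their key."""
    groups: Dict[str, List[float]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_prices(zip_code: str, prices: List[float]) -> Tuple[str, float]:
    """Reduce phase: collapse the values for one key into a single result."""
    return zip_code, sum(prices) / len(prices)


def run_job(listings: Iterable[dict]) -> Dict[str, float]:
    """Run map, shuffle, and reduce in sequence within one process."""
    pairs = (pair for listing in listings for pair in map_listing(listing))
    return dict(reduce_prices(k, v) for k, v in shuffle(pairs).items())
```

In a distributed setting the map and reduce functions would run on many machines and the shuffle would move data between them; the sketch keeps everything in memory purely to show the shape of the computation.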

Value Delivered

SG Analytics automated the data acquisition and consolidation process across various data sources.
The client subsequently extended the project scope and asked SG Analytics to support data processing and modeling as well.