What STR Data Exists?

What Short Term Rental (STR) Data exists for collection?

The STR industry has been growing, attracting more investors and vendors and fueling demand for data to stay ahead of the increasing competition in the space. But what data exists, and how can it be leveraged? We have been scraping STR data since 2012 to help build STR products and companies, and we can help you understand the data landscape.


Main forms of Data

There are a few main forms of data that can be leveraged in the STR space. 

Sourced Data

Sourced data is collected directly from the source: the property managers themselves. Managers may choose to upload their data to aggregators, who show their performance against others in their market.


Sourced data is typically more accurate and complete than other forms of data, but it can be more time-consuming and expensive to collect. It is also biased by who chooses to report, and the population of managers willing to give their data away is much smaller. Reporting tools often anonymize sourced data and lack features to drill down to the individual property level, and because so few managers report in, there are limited ways to pivot the data.


Since sourced data is aggregated, the results are typically market-level reports on headline KPIs such as ADR, occupancy, RevPAN, available nights and demand comparisons.
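As a rough illustration, the headline KPIs can be sketched in a few lines of Python. This assumes the commonly used definitions (ADR as revenue per booked night, occupancy as booked nights over available nights, RevPAN as revenue per available night); aggregators may define available nights differently, and the `MarketTotals` fields and sample numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MarketTotals:
    """Hypothetical market-level totals for one reporting period."""
    revenue: float         # total rental revenue
    booked_nights: int     # nights sold as reservations
    available_nights: int  # assumed here to mean booked + unsold nights


def headline_kpis(totals: MarketTotals) -> dict:
    """Compute ADR, occupancy and RevPAN, guarding against empty markets."""
    adr = totals.revenue / totals.booked_nights if totals.booked_nights else 0.0
    occupancy = (
        totals.booked_nights / totals.available_nights
        if totals.available_nights else 0.0
    )
    revpan = totals.revenue / totals.available_nights if totals.available_nights else 0.0
    return {"adr": adr, "occupancy": occupancy, "revpan": revpan}


# Made-up sample numbers for a single market and month.
kpis = headline_kpis(MarketTotals(revenue=50_000.0, booked_nights=200, available_nights=310))
```

Under these definitions RevPAN equals ADR multiplied by occupancy, which is a handy sanity check when comparing figures from different reports.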

Scraped Data

Scraped data is collected from websites or other online sources such as Airbnb, Vrbo, Booking or specific manager websites. This can be done manually or with the help of software. Scraping can be a quick and easy way to get a lot of data and the widest view of what is out there.


While scraped data offers the widest view of the market and every listing that is out there, it is important to keep its limitations in mind. Since the data is scraped, individual reservation rates and occupancy type (block or booking) must be inferred with AI and machine learning, using a healthy amount of sourced reservations to model against. There will be some natural margin of error, but as long as it stays within a reasonable bound, scraped data can unlock massive advantages.


Scraped Data - Listing Information 

The scraped listing-level data is the highest-confidence data, since it covers all the public-facing attributes of the property. This data requires no inference or modeling because it is the property's advertisement.

The data points include: 

Scraped Data - Manager Information 

Scraped data also offers information on the property's manager. When guests choose a property, its management is very important because it determines the level of service they will receive.

These data points include: 

Scraped Data - Calendar Information

The calendar is one of the most important features to scrape. While the listing gives you attributes by which you can aggregate, filter or combine your data, the calendar provides the information that leads to performance metrics and revenue.

These data points by calendar date include: 


With calendar data, a single scrape will not gather everything you need for revenue and performance data. You will need to scrape every listing on a daily basis in order to identify changes to the calendar.
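To make the daily comparison concrete, here is a minimal sketch of diffing two consecutive calendar snapshots for one listing. The snapshot shape (stay date mapped to an availability flag and a nightly price) and the event names are assumptions for illustration, not the format of any particular channel or data feed.

```python
from datetime import date
from typing import Dict, List, Optional, Tuple

# Hypothetical snapshot shape: stay date -> (is_available, nightly_price or None)
Snapshot = Dict[date, Tuple[bool, Optional[float]]]


def diff_snapshots(previous: Snapshot, current: Snapshot) -> List[dict]:
    """List calendar changes between two scrapes of the same listing."""
    changes = []
    # Only compare stay dates that appear in both scrapes.
    for stay_date in sorted(previous.keys() & current.keys()):
        was_available, old_price = previous[stay_date]
        is_available, new_price = current[stay_date]
        if was_available and not is_available:
            # The night closed: keep the last advertised price for later modeling.
            changes.append({"date": stay_date, "event": "became_unavailable",
                            "last_price": old_price})
        elif not was_available and is_available:
            changes.append({"date": stay_date, "event": "reopened", "price": new_price})
        elif was_available and is_available and old_price != new_price:
            changes.append({"date": stay_date, "event": "price_change",
                            "old": old_price, "new": new_price})
    return changes


# Made-up scrapes from two consecutive days.
yesterday = {
    date(2024, 7, 1): (True, 200.0),
    date(2024, 7, 2): (True, 200.0),
    date(2024, 7, 3): (False, None),
}
today = {
    date(2024, 7, 1): (False, None),
    date(2024, 7, 2): (True, 225.0),
    date(2024, 7, 3): (False, None),
}
changes = diff_snapshots(yesterday, today)
```

The diff only says that a night closed, not why. Whether it was a booking or a block still has to be inferred separately.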


This is a very complex process, and having a strong data scraping company to support you can save hundreds of thousands in staffing and server costs, plus the year or more it takes to build a useful scraper. This is where Hungry Robots can help. Additional information on our calendar scraping data can be found here.

Scraped Data - Calculated Values 

Since the scraped calendar data is very limited and raw, the most important information is calculated on top of it. Tracking changes in price, when dates are no longer available and which dates became unavailable together will help you build a robust view into STR performance.


To calculate revenue, occupancy, RevPAN and more, you will need a very solid booked and blocked model to infer which dates were booked as a reservation and which were blocked due to owner holds or maintenance. Hungry Robots uses sourced reservations on over 250,000 listings to model the probability of an unavailability being either a booking or a block. With that information, Hungry Robots data can be leveraged for occupancy, revenue and pacing information. If you want to learn more about how Hungry Robots models this, schedule time with an expert!
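One way to picture how a booking probability feeds these metrics is an expected-value calculation over the unavailable nights. The sketch below is an illustration under stated assumptions, not Hungry Robots' actual model: the `p_booked` values are invented, the last advertised price stands in for the reservation rate, and expected blocked nights are removed from the available-night denominator.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class UnavailableNight:
    """One night that closed on the calendar (hypothetical fields)."""
    last_seen_price: float  # last advertised price before the night closed
    p_booked: float         # assumed model output: probability of booking vs. block


def expected_performance(nights: List[UnavailableNight], calendar_nights: int) -> dict:
    """Turn per-night booking probabilities into expected totals."""
    expected_booked = sum(n.p_booked for n in nights)
    expected_blocked = len(nights) - expected_booked
    expected_revenue = sum(n.p_booked * n.last_seen_price for n in nights)
    # Assumption: blocked nights were never really for rent, so exclude them.
    available = calendar_nights - expected_blocked
    occupancy = expected_booked / available if available > 0 else 0.0
    return {
        "expected_booked_nights": expected_booked,
        "expected_revenue": expected_revenue,
        "occupancy": occupancy,
    }


# Made-up example: three closed nights in a 30-night month.
nights = [
    UnavailableNight(last_seen_price=200.0, p_booked=0.9),
    UnavailableNight(last_seen_price=200.0, p_booked=0.9),
    UnavailableNight(last_seen_price=150.0, p_booked=0.2),
]
result = expected_performance(nights, calendar_nights=30)
```

Weighting by probability rather than forcing each night into a hard booked or blocked label keeps the model's uncertainty in the totals; a thresholded classification is a simpler alternative when a yes or no answer per night is needed.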


With an accurate booked and blocked model and booking dates, the calculations below can be done:


To ensure the market is not being double or triple counted, Hungry Robots also provides a modeled Crosswalk file in the data feed, which maps listings across multiple distribution channels together. When measuring a market's performance, it is important to adjust for or remove duplicates to avoid double or triple counting. However, if you are building a pricing model, you may want to intentionally overweight multi-channel listings, in which case leaving the duplicates in will do that for you.


With a good crosswalk file, you can identify units or homeowners on only one channel, channel importance by market, premiums or differing booking behaviors across channels, or upside opportunity in a target acquisition. Channels can leverage it to allocate marketing dollars to markets where they have not fully penetrated the inventory. The crosswalk file helps you identify where the TAM is being marketed, how much is offline and more!
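As a small sketch of how a crosswalk can be applied, the Python below groups channel listings by a shared property identifier, counts unique properties and flags those seen on only one channel. The crosswalk layout used here, a mapping from (channel, listing ID) to a property ID, is a hypothetical stand-in and not the actual format of the Hungry Robots file.

```python
from collections import defaultdict

# Hypothetical crosswalk: (channel, listing_id) -> shared property_id
crosswalk = {
    ("airbnb", "A1"): "P1",
    ("vrbo", "V9"): "P1",   # same home as airbnb A1
    ("airbnb", "A2"): "P2",
}

# Made-up scraped listings across channels.
listings = [
    {"channel": "airbnb", "listing_id": "A1"},
    {"channel": "vrbo", "listing_id": "V9"},
    {"channel": "airbnb", "listing_id": "A2"},
    {"channel": "booking", "listing_id": "B7"},  # not present in the crosswalk
]


def group_by_property(listings, crosswalk):
    """Map each unique property to the set of channels it is listed on."""
    groups = defaultdict(set)
    for row in listings:
        key = (row["channel"], row["listing_id"])
        # Unmatched listings are treated as their own property.
        property_id = crosswalk.get(key, f"{key[0]}:{key[1]}")
        groups[property_id].add(row["channel"])
    return dict(groups)


properties = group_by_property(listings, crosswalk)
unique_count = len(properties)  # deduplicated market size
single_channel = sorted(pid for pid, chans in properties.items() if len(chans) == 1)
```

For market-level metrics you would aggregate on the property ID; for a pricing model that should overweight multi-channel listings, you would skip the grouping and keep one row per listing.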


With an accurate crosswalk file, the calculations below can be done:


Agency or Government Statistics 

The last form of STR data that can be collected comes from Destination Marketing Organizations (DMOs), tourism boards or federal government bodies such as the US Census. These organizations collect data, tax receipts and census information to provide data at the block group, city or state level, and they can offer a healthy history of data.


For example, the US Census publishes how many seasonally vacant homes there are in different jurisdictions, which helps provide a larger TAM number than what is listed on channels like Airbnb and Vrbo. Not all seasonally vacant properties are made available for rent, and of those that are, not all are listed on every channel. While Hungry Robots offers a robust crosswalk file to get the total market, the US Census figure provides a good upper bound on available inventory.


Another example is the Tourism Board of Hawaii, which shares flight arrivals by country and the county to which visitors are flying. Since Hawaii is a fly-to destination where all tourists must pass through government surveys, there is a healthy amount of tourism data collected and shared.

Uses of STR Data 

While the uses of STR data are technically limitless, we have put together a few examples below.


We have a full article here on what can be built and leveraged off of the STR data for more information. 


If this article has you excited to get your hands on STR data, Hungry Robots is happy to help. Click here to schedule time with an expert to learn more about how Hungry Robots can help with your STR data needs.