Frequently asked questions - Main

Dynamic Demographics Product Guide

Product type: Data
Portfolio: Enrich
Product family: Enrich Demographics > Segmentations and Geodemographics
Product: Dynamic Demographics
Version: Main
Language: English
Product name: Dynamic Demographics
Title: Dynamic Demographics Product Guide
Copyright: 2024
First publish date: 2022

What is the average area/size of Hex 9 (Uber H3 Level 9) polygons?

The following table compares H3 resolutions 0 through 11. At H3 Level 9, the average hexagon area is about 0.105 km2 and the average edge length is about 0.174 km.

H3 Resolution   Average Hexagon Area (km2)   Average Hexagon Edge Length (km)   Number of Unique Cells
0               4,250,546.8477000            1,107.712591000                    122
1               607,220.9782429              418.676005500                      842
2               86,745.8540347               158.244655800                      5,882
3               12,392.2648621               59.810857940                       41,162
4               1,770.3235517                22.606379400                       288,122
5               252.9033645                  8.544408276                        2,016,842
6               36.1290521                   3.229482772                        14,117,882
7               5.1612932                    1.220629759                        98,825,162
8               0.7373276                    0.461354684                        691,776,122
9               0.1053325                    0.174375668                        4,842,432,842
10              0.0150475                    0.065907807                        33,897,029,882
11              0.002149643                  0.02491056                         237,279,209,162

(Source: https://h3geo.org/docs/core-library/restable/)
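
The area and cell-count columns follow a regular pattern: each step up in resolution divides the average area by 7, and the cell counts match 2 + 120 * 7^resolution. The following Python sketch reproduces those two columns from the resolution 0 row. The function names are illustrative only and are not part of any Precisely or H3 API.

```python
# Sketch: reproduce two table columns from the resolution 0 row.
# Function names are hypothetical, chosen for illustration.
RES0_AREA_KM2 = 4_250_546.8477  # average area at resolution 0 (from the table)


def num_cells(res: int) -> int:
    """Number of unique cells at a resolution: 2 + 120 * 7**res."""
    return 2 + 120 * 7 ** res


def avg_area_km2(res: int) -> float:
    """Average cell area, dividing the resolution 0 area by 7 per level."""
    return RES0_AREA_KM2 / 7 ** res


for res in (8, 9, 10):
    print(res, num_cells(res), round(avg_area_km2(res), 7))
```

This can be handy for a quick estimate of how many hexagons cover a study area at a given level without looking up the table.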

Is there a correlation between polygon size and data volatility?

Level 9 hexagons cover an area of approximately 0.1 km2. The obvious upside of such a granular spatial resolution is that it allows patterns to be identified at a detailed level and small areas of particular interest to be isolated. The downside is that in a smaller spatial area, less data is collected per spatial unit.

A smaller sample size (compared to larger administrative areas) can lead to more volatile metrics. It also affects metrics such as dwell time: when a single long dwell crosses into neighboring hexagons, it is split into several shorter dwell times, so the reported values are shorter and less insightful than the real visit.
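
The splitting effect can be illustrated with a small Python sketch. It assumes a hypothetical list of (minute, hexagon ID) observations for one device and measures dwell as the time between the first and last observation in each consecutive run within the same hexagon. This is an illustration of the effect only, not a description of how the product computes dwell time.

```python
from itertools import groupby


def dwell_by_hexagon(pings):
    """pings: list of (minute, hex_id) tuples sorted by time.

    Returns (hex_id, dwell_minutes) for each consecutive run of
    observations that fall in the same hexagon.
    """
    runs = []
    for hex_id, group in groupby(pings, key=lambda p: p[1]):
        minutes = [minute for minute, _ in group]
        runs.append((hex_id, minutes[-1] - minutes[0]))
    return runs


# Hypothetical visit: one device moving between two adjacent hexagons.
pings = [(0, "A"), (10, "A"), (20, "A"), (30, "B"), (40, "B"), (50, "A"), (60, "A")]

whole_visit = pings[-1][0] - pings[0][0]  # dwell if the area were one unit
split = dwell_by_hexagon(pings)           # dwell per hexagon run

print(whole_visit)
print(split)
```

In this made-up example, one 60-minute visit is recorded as three separate dwells of 20 minutes or less.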

Another downside of small hexagons is that time series analysis can suffer from data gaps. If a particular hexagon receives data in some periods but not in others, comparisons over time become difficult. OAs, by contrast, increase in size as population density decreases, which helps ensure that they have enough data in every period to make time series analysis possible.

Smaller hexagons work best where data is dense, mostly in urban areas and areas with high visitation rates. Generally, whenever a metric is observed as very volatile in a smaller hexagon, it's advisable to use an OA for analysis.
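
One way to apply this guidance is a simple screening rule, sketched below in Python. It assumes a metric is available as one value per period, with None marking periods where the hexagon has no data, and it uses the coefficient of variation as a stand-in for "very volatile". The function name and the 0.5 threshold are assumptions for illustration, not values from this guide.

```python
from statistics import mean, pstdev


def suggest_unit(values, max_cv=0.5):
    """Suggest "H3" or "OA" for a metric observed over several periods.

    values: one metric value per period, None where no data exists.
    max_cv: hypothetical volatility threshold (coefficient of variation).
    """
    if not values or any(v is None for v in values):
        return "OA"  # gaps make period-over-period comparison unreliable
    avg = mean(values)
    if avg == 0:
        return "OA"  # no signal to measure volatility against
    cv = pstdev(values) / avg
    return "OA" if cv > max_cv else "H3"


print(suggest_unit([100, 105, 98, 102]))  # stable series
print(suggest_unit([5, 40, None, 12]))    # series with a gap
print(suggest_unit([2, 30, 1, 25]))       # volatile series
```

A suitable threshold depends on the metric and the use case, so treat the default here as a starting point to tune.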

What does the value 9999999 in the ORIGIN_AREA_ID and rank fields represent?

In the ORIGIN_AREA_ID field, 9999999 is used to aggregate origins with very low PERCENT_POP_T values, in order to reduce a long tail of origins. The value is applied only after the cumulative sum of PERCENT_POP_T surpasses 90% and the remaining origins contribute less than 1% each.
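
The rule can be sketched in Python as follows. This is one reading of the description above, assuming origins are processed in descending order of PERCENT_POP_T and that PERCENT_POP_T is expressed in percent. The function and parameter names are hypothetical.

```python
AGGREGATE_ID = 9999999  # placeholder ORIGIN_AREA_ID for the aggregated tail


def aggregate_tail(origins, cum_threshold=90.0, min_share=1.0):
    """Collapse the long tail of origins into a single aggregate origin.

    origins: dict mapping ORIGIN_AREA_ID -> PERCENT_POP_T (percent).
    An origin joins the tail only if the cumulative share of the origins
    before it exceeds cum_threshold AND its own share is below min_share.
    """
    ordered = sorted(origins.items(), key=lambda kv: kv[1], reverse=True)
    kept, tail_total, cumulative = [], 0.0, 0.0
    for origin_id, share in ordered:
        if cumulative > cum_threshold and share < min_share:
            tail_total += share
        else:
            kept.append((origin_id, share))
        cumulative += share
    if tail_total > 0:
        kept.append((AGGREGATE_ID, tail_total))
    return kept


# Hypothetical origins and shares for one destination.
origins = {1: 50.0, 2: 30.0, 3: 11.5, 4: 5.0, 5: 2.0, 6: 0.75, 7: 0.5, 8: 0.25}
result = aggregate_tail(origins)
print(result)
```

In this made-up example, origins 4 and 5 stay separate even though the cumulative share is already above 90%, because each contributes at least 1%.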

In rank columns, the value 9999999 is used in two cases. The first is to set the rank for the aggregate origin described above, which should not be included in the ranking of regular origins. The second is when an origin shows no flows to the destination for some day parts, week parts, or ORIGIN_AREA_TYPE values. Where flows are present, the origin is ranked in the normal manner; where there are no flows to be ranked, it receives a rank of 9999999.
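
When working with rank columns, it helps to treat 9999999 as a placeholder rather than a real rank. The Python sketch below shows both cases for a single day part, assuming shares are held in a dictionary and None marks an origin with no flows. The names are hypothetical, and tie handling is not described in this guide, so ties here simply keep their input order.

```python
SENTINEL = 9999999  # used both as the aggregate origin ID and as the "no rank" value


def rank_origins(shares):
    """Rank origins for one day part by descending share.

    shares: dict mapping ORIGIN_AREA_ID -> PERCENT_POP_T, or None where
    the origin shows no flows to the destination in this day part.
    """
    rankable = sorted(
        (oid for oid, share in shares.items()
         if oid != SENTINEL and share is not None),
        key=lambda oid: shares[oid],
        reverse=True,
    )
    ranks = {oid: position for position, oid in enumerate(rankable, start=1)}
    # The aggregate origin and origins without flows get the placeholder rank.
    return {oid: ranks.get(oid, SENTINEL) for oid in shares}


# Hypothetical shares for one destination and one day part.
ranks = rank_origins({11: 40.0, 12: 55.0, 13: None, 9999999: 5.0})
print(ranks)
```

Filtering out rows whose rank equals 9999999 before computing rank statistics avoids skewing averages or maximums.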

Is there a minimum number of mobility data transactions for an area that can be used in a dataset?

Yes. Every aggregation in the dataset must contain more than 10 data points, that is, at least 11. However, although each aggregation contains at least 11 individual transactions, these do not necessarily come from 11 individual devices. By using 12 months of data in all calculations (except seasonality) and applying the >10 data point minimum filter, Precisely reduces the risk of introducing origin-destination results for an individual device at a location into the dataset.
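
As a minimal sketch of such a filter, the Python snippet below keeps only aggregations with more than 10 data points. The dictionary layout and names are assumptions for illustration; the guide does not describe how the filter is implemented.

```python
MIN_EXCLUSIVE = 10  # aggregations need more than this many data points


def keep_aggregations(counts):
    """counts: dict mapping an aggregation key -> number of data points."""
    return {key: n for key, n in counts.items() if n > MIN_EXCLUSIVE}


# Hypothetical counts per (origin, destination) pair.
counts = {("O1", "D1"): 10, ("O2", "D1"): 11, ("O3", "D1"): 250}
kept = keep_aggregations(counts)
print(kept)
```

An aggregation with exactly 10 data points is dropped, while one with 11 is retained.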