Filling the New Column with Aggregate Data - MapBasic - 2023

MapInfo MapBasic Reference

Product type
Software
Portfolio
Locate
Product family
MapInfo
Product
MapInfo > MapBasic
Version
2023
Language
English
Product name
MapBasic
Title
MapInfo MapBasic Reference
First publish date
1985
Last updated
2023-09-12
Published on
2023-09-12T16:32:32.686312

If you specify a From clause, you can calculate values for the new column by aggregating data from the second table. To perform data aggregation, specify a Set To clause that includes an aggregate function.

The following table lists the available aggregate functions.

Function Syntax Value Stored In The New Column
Avg() Avg( col ) Average of values from rows in the source table.
Count() Count( * ) Number of rows in the source table that correspond to the row in the table being updated.
Max() Max( col ) Largest of the values from rows in the source table.
Min() Min( col ) Smallest of the values from rows in the source table.
Sum() Sum( col ) Sum of the values from rows in the source table.
WtAvg() WtAvg( col, weight_col ) Weighted average of the values from the source table; the averaging is weighted so that rows having a large weight_col value have more of an impact than rows having a small weight_col value.
Proportion Avg() Proportion Avg( col ) Average calculation that makes adjustments based on how much of an object is within another object.
Proportion Sum() Proportion Sum( col ) Sum calculation that makes adjustments based on how much of an object is within another object.
Proportion WtAvg() Proportion WtAvg( col , weight_col ) Weighted average calculation that makes adjustments based on how much of an object is within another object.
Note: Count returns an integer value. All other functions return a float value. (No MapBasic function, aggregate or otherwise, returns a decimal value. A decimal field is only a way of storing the data. The arithmetic is done with floating point numbers.)

Most of the aggregate functions operate on data values only. The last three functions (Proportion Sum, Proportion Avg, Proportion WtAvg) perform calculations that take geographic relationships into account. This is best illustrated by example.

Suppose you have a Counties table, containing county boundary regions and demographic information (such as population) about each county. You also have a Risk table, which contains a region object. The object in the Risk table represents some sort of area that is at risk; perhaps the region object represents an area in danger of flooding due to proximity to a river.



1 County Boundaries 2 Risk Buffer Region

Given these two tables, you might want to calculate the population that lives within the risk region. If half of a county's area falls within the risk region, you will consider half of that county's population to be at risk; if a third of a county's area falls within the risk region, you will consider a third of that county's population to be at risk; etc.

The following example calculates the population at risk by using the Proportion Sum aggregate function, then stores the calculation in a new column (population_at_risk):

Add Column Risk(population_at_risk Integer) 
	From counties 
		Set To Proportion Sum(county_pop) 
		Where Intersects 

For each county that is at least partly within the risk region, MapInfo Pro adds some or all of the counties county_pop value to a running total.

The Proportion Sum function produces results based on an assumption―the assumption that the number being totalled is distributed evenly throughout the region. If you use Proportion Sum to process population statistics, and half of a region falls within another region, MapInfo Pro adds half of the region's population to the total. In reality, however, an area representing half of a region does not necessarily contain half of the region's population. For example, the population of New York State is not evenly distributed, because a large percentage of the population lives in New York City.

If you use Proportion Sum in cases where the data values are not evenly distributed, the results may not be realistic. To ensure accurate results, work with smaller region objects (for example, operate on county regions instead of state regions).

The Proportion Avg aggregate function performs an average calculation which takes into account the percentage of an object that is covered by another object. Continuing the previous example, suppose the County table contains a column, median_age, that indicates the median age in each county.

The following statement calculates the median age within the risk zone:

Add Column Risk(age Float) 
	From Counties 
		Set To Proportion Avg(median_age) 
		Where Intersects 

For each row in the County table, MapInfo Pro calculates the percentage of the risk region that is covered by the county; that calculation produces a number between zero and one, inclusive. MapInfo Pro multiplies that number by the county's median_age value, and adds the result to a running total. Thus, if a county has a median_age value of 50, and if the county region covers 10% of the risk region, MapInfo Pro adds 5 (five) to the running total, because 10% of 50 is 5.

Both Proportion Sum and Proportion Avg keep running totals. For example:

If half the county falls in the risk area, then you take half the value and add it to the running total. If it is 10%, then you add 10% of the value to the running total. However, Proportion Avg should be an average, so if 4 counties intersect the risk area, then you take the running total and divide by 4.

If county1 intersects the risk region, and 50% of county1 intersects the risk region, and the population of county1 is 66, then you add 33 to the running total.

If 30% of county2's area intersects the risk area and the population is 100, then add 30 to the running total.

If county3 has 20% overlap with the risk area and has a population of 50, then add 10 to the running total.

If county4 has 10% overlap with the risk area and has a population of 60, then add 6 to the running total.

Then the Proportion Sum is 33+30+10+6 = 82

Then the Proportion Avg is (33+30+10+6)/4 = 20 (or 21 depending on round off, but I think 20).

Proportion WtAvg is similar to Proportion Avg, but it also lets you specify a data column for weighting the average calculation; the weighting is also proportionate. For example:

Weighted Average should take a weighted value from another column; for the previous example there is another column called RuralPercent in the County table. If the risk is for flood and the rural areas are where it floods, then for risk you only want the population from the rural area.

If county1 has 50% overlap with the risk region, a population of 66, and a RuralPercent of 0.8, then add (0.5 * 66 * 0.8) = 26.

If county3, 4, and 5 are all 50% rural, then:

county3 0.3 * 100 * 0.5 = 15

county4 0.2 * 50 * 0.5 = 5

county5 0.1 * 60 * 0.5 = 3

Then the proportion weighted Avg is: (26 + 15 + 5 + 3)/2.3 = 21.3043