Friday, December 21, 2018

Best books I read in 2018

Over the course of the past year I read at least 50 books (19 non-fiction), not counting the parade of glossy cookbooks I flipped through from the public library or the dozen or more Boxcar Children books I read to my son at bedtime.

Here are some of the highlights, in no particular order...

Evicted: Poverty and Profit in the American City
by Matthew Desmond
Part ethnography, part detailed analysis of Milwaukee's rental market, part policy brief, this book may change the way you view wealth and poverty in America. The personal stories make the book a page-turner, but those anecdotes are supported with concrete evidence that there are structural flaws in our housing markets. This book should be required reading.

Call the Midwife: A Memoir of Birth, Joy, and Hard Times (The Midwife Trilogy Book 1)
by Jennifer Worth
I did not find out about the PBS television adaptation of Call the Midwife until months after I read the first book, but even if the series is great viewing, I'm going to recommend the books anyway. Worth's descriptions of 1950s Docklands slums in London, with their coal soot, limited plumbing, and rickets-inducing lack of sunlight are vivid, and her tales of midwifery, family structure, and social norms are compelling.

And I know I'm really late to this party, but The Emperor of All Maladies: A Biography of Cancer by Siddhartha Mukherjee was worth every minute spent reading its nearly 600 pages.

And I just started...
Eleanor Roosevelt, Volume 2: The Defining Years, 1933-1938 (Eleanor Roosevelt, 1933-1938)
by Blanche Wiesen Cook
This biography of Eleanor Roosevelt is brilliantly written, meticulously researched, and is at times laugh-out-loud funny. Did you know ER (as the author refers to her) . While ER was imperfect, as all humans are, she was a staunch advocate for women's rights and minority rights. In one incident, a racist woman complained about ER's work on racial equality, so ER replied that she knew black people who were “not only the equal of whites but mentally superior.” Emily Alpert Reyes recommended this book, so now I'm paying that recommendation forward.

Friday, July 13, 2018

Geek Jokes Galore

In case you don't already play along, every Friday I post a #GeekJoke on Twitter (@DataGeekB)

Over the years we've had demographer jokes, statistician jokes, economist jokes, mathematician jokes, and more. Here are a few of my favorites:

A demographer is just a mathematician broken down by age and sex.

I just saw my colleague with a piece of graph paper.
I think she must be plotting something.

Why do teenagers travel in groups of 3 or 5?
Because they can't even...

Did you hear about the mathematician who’s afraid of negative numbers?
She'll stop at nothing to avoid them...

First day on the job, a boss warns her new employee to avoid the statisticians in the cafeteria: "They're just mean."

To women who ask: "Should I continue to have kids after 35?"
Me: "I don't want to tell you how to live your life, but 35 is a lot of kids."

2 mutually exclusive categories went on a date.
It didn't work out.
They had nothing in common.

Biologist, Demographer & Mathematician sit at a cafe. Across the street they see a man and a woman enter a building. Later those two people reappear with a 3rd person. 
They multiplied! says the Biologist.
It's an error in measurement! says the Demographer.
If 1 person enters the building now, it will be empty again, concludes the Mathematician.

There's a fine line between a numerator and a denominator...

An economist thinks that her equations are an approximation to reality.
A physicist thinks reality is an approximation to her equations.
A mathematician doesn't care.

If you live to be 100, you've got it made.
Very few people die past that age.

A farmer counted 297 cows in the field.
But when he rounded them up, he had 300.

I made a chart of past relationships.
It has an ex axis and a why axis.



And a couple of geeky riddles:


What always goes up, never goes down?
Your age.

When your code won't run, what can you still count on?
Your fingers.

2 mothers & 2 daughters sat down to breakfast. They had 3 cups of coffee. Each person had exactly 1 cup of coffee.
How is that possible?
(Hint: If you've worked w complex household structure data, you'll figure this one out)


Thursday, May 17, 2018

2017 births: lowest teen and young adult birth rate on record, rising rates at older ages

Highlights from the preliminary 2017 birth data

Birth rates for U.S. teens and early 20-somethings are at (another!) all-time low, and birth rates continue to rise at ages 40 and older.

Another important milestone is that, for the first time on record (2016), birth rates for ages 30-34 exceeded the rate for ages 25-29.

As a historical demographer with some experience studying fertility and mortality rate trends over the past century (and the century before), I can say with conviction that the trend toward higher birth rates at ages 30+ is not really new. Birth rates for women ages 35 and older are not at record highs. (They're not even higher now than they were in the 1950s and 1960s.) I would argue that rising birth rates among those in their 30s and 40s are more a return to long-run historical norms than an aberration. (First births at older ages are a newer phenomenon, but the rate at older ages is nothing new.)


The National Vital Statistics Reports, published by the U.S. Centers for Disease Control and Prevention, provide historical birth rate data by age of mother as far back as 1970. Earlier years are available, but must be compiled from a variety of other sources including the older, and often PDF-scan-only Monthly Vital Statistics Reports and the U.S. Statistical Abstract. From those sources I collected data as far back as 1920, with complete annual data from 1935-present. The historical birth rates (births per 1,000 women) are shown in the chart above.
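For anyone compiling rates from raw counts in the older reports, here is a minimal sketch of the arithmetic behind an age-specific birth rate: births to women in an age group, divided by the number of women in that age group, times 1,000. The function name and the counts in the example are hypothetical and for illustration only; they are not values from the vital statistics reports.

```python
def age_specific_birth_rate(births, women):
    """Births per 1,000 women in one age group for one year."""
    if women <= 0:
        raise ValueError("women must be a positive population count")
    return 1000 * births / women


# Hypothetical counts for illustration only (not real vital statistics data).
example_rate = age_specific_birth_rate(births=50_000, women=1_000_000)
print(f"{example_rate:.1f} births per 1,000 women")
```

Because each rate uses the population of women in that age group as its denominator, rates stay comparable across years even when the size of an age group changes.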

I am happy to share the raw data upon request. Feel free to contact me for more information.

Friday, May 11, 2018

A note about measuring maternal mortality in Texas

You may (or may not) have heard that Texas has the highest maternal mortality in the nation, as a result of recent, dramatic increases in reported maternal deaths.

Or... it doesn't.

Researchers working for the state of Texas conducted a reassessment of the 2012 maternal mortality records. Researchers hypothesize that data entry errors led to records being inaccurately classified as maternal deaths. Knowing, as we do, that maternal mortality reporting has some considerable data accuracy challenges, this seems on the surface to be a good faith effort.

That said, I have concerns with the methods in the Texas analysis (explained in more detail below). In addition, while the authors do a nice job of stating the limitations of their work--data are not comparable to other years or other locations--the news media did exactly the opposite: compare to other places and times.

In 2012 Texas had 147 mortality records with an ICD-10 code indicating maternal mortality (codes A34, O00–O95, or O98–O99). Texas researchers used record matching and extensive death and health record review for the 147 maternal mortality coded deaths. Through this process the researchers identified birth or pregnancy status (within 42 days) at the time of death. This extensive review found a number of false positive results. Researchers then removed these deaths from the maternal mortality count. On this point, the analysis seems both reasonable and robust.
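To make the code ranges concrete, here is a minimal sketch of how one might flag records with a maternal mortality cause-of-death code (A34, O00-O95, or O98-O99). It assumes the code is stored as a text string, with or without a decimal point; that storage format and the function name are my assumptions for illustration, not a description of how Texas or the CDC process their files.

```python
import re


def is_maternal_code(icd10_code):
    """Return True for ICD-10 codes A34, O00-O95, or O98-O99."""
    # Tidy up the entry: trim spaces, uppercase, and drop the decimal point.
    code = icd10_code.strip().upper().replace(".", "")
    if code.startswith("A34"):
        return True
    # Chapter O codes: the letter O followed by a two-digit category.
    match = re.match(r"O(\d{2})", code)
    if not match:
        return False
    category = int(match.group(1))
    # O96 and O97 fall outside the ranges listed above.
    return 0 <= category <= 95 or 98 <= category <= 99


# Hypothetical list of cause-of-death codes, for illustration only.
codes = ["O14.1", "A34", "O96", "I21.9"]
print([c for c in codes if is_maternal_code(c)])
```

Note that O96 and O97 are deliberately left out, which is consistent with the 42-day window described above.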

However, any analysis of data coding errors should clearly identify both false positives and false negatives. The search for false positives was (as described above) robust. The search for false negatives, on the other hand, used record matching alone. This may seem a minor point, but it is important because the robust methods used to find false positives were not similarly applied to find false negatives. Moreover, the record linkage process matched on SSN, name, and county of residence. Given their hypothesis of data entry errors, requiring exact matches on all three of those open-ended data entry fields raises all sorts of possibilities for missed matches. In other words, the methods introduced potential bias in favor of finding false positives and against finding false negatives.

I recognize that individual case review for 9,000+ death records was probably infeasible due to time and funding constraints. Still, more could easily have been done to try to identify false negatives. For example, they might have added a record linkage between all birth records and all death records.
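To illustrate why exact matching on open-ended fields can miss true matches, and what a more tolerant linkage might look like, here is a minimal sketch. It is not the Texas researchers' method. The field names, the sample records, and the 0.85 name-similarity threshold are all hypothetical choices made for illustration, and a real linkage project would need to tune and validate them.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase and drop everything except letters and digits."""
    return "".join(ch for ch in text.lower() if ch.isalnum())


def exact_match(a, b):
    """Strict linkage: SSN, name, and county must be identical as entered."""
    return all(a[field] == b[field] for field in ("ssn", "name", "county"))


def tolerant_match(a, b, threshold=0.85):
    """Looser linkage: same SSN digits, or same county with a similar name."""
    ssn_a = normalize(a["ssn"])
    if ssn_a and ssn_a == normalize(b["ssn"]):
        return True
    name_similarity = SequenceMatcher(
        None, normalize(a["name"]), normalize(b["name"])
    ).ratio()
    same_county = normalize(a["county"]) == normalize(b["county"])
    return same_county and name_similarity >= threshold


# Hypothetical records: a one-letter name typo plus inconsistent formatting.
death = {"ssn": "123-45-6789", "name": "Maria Gonzales", "county": "Travis"}
birth = {"ssn": "123456789", "name": "Maria Gonzalez", "county": "TRAVIS"}

print(exact_match(death, birth))
print(tolerant_match(death, birth))
```

The strict comparison rejects this pair because of formatting differences alone, while the tolerant one accepts it. The trade-off is that looser rules will also link some records that do not belong together, so the matches they produce still need review.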

And... Here's the piece that puzzles me most...

The Texas researchers posit (repeatedly) that the number of false positives is some artifact of newer data entry techniques. They state, specifically, that the upswing in the reported maternal mortality rate was driven by an increase in e-reporting:
"The percentage of death certificates submitted electronically increased from 63% in 2010 to 91% in 2012"
But... if electronic reporting was the problem, why wouldn't the problem have shown up in 2010 when 63% was already e-reported? Why do they think an incremental 28 percentage points was pivotal when the first 63% was not? And, perhaps most importantly, why do they skip over 2011 when that year (not 2012) was the pivotal year for the increase in reported maternal deaths in Texas? (2011 was also the year TX began restricting family planning and reproductive health services.)
Texas Maternal Mortality Trend
Source: CDC WONDER, Multiple Cause of Death database, and natality database
Note: CDC reports 148 deaths (ICD-10 codes A34, O00-O95, O98-O99); TX reports 147


Wednesday, February 14, 2018

Valentines by the numbers (2018)

According to the National Retail Federation, U.S. consumers will spend nearly $20 BILLION on Valentines this year. (More than half of celebrants will give candy as a gift, but most spending will be on jewelry.)

While love may be in the proverbial air, young adults are not rushing down the aisle. Median age at first marriage continues to reach new record highs (now 29.5 for men and 27.4 for women).

While age at first marriage has been increasing, and it is incredibly difficult to get an accurate measurement of the rate of divorce, by all accounts divorce rates are falling and may be at a 40-year low. By Census Bureau measures, divorce rates peaked in the years following the changes in divorce laws that occurred in the mid-1970s, but then leveled off and fell slightly. Some of this trend can be attributed to lower marriage rates (fewer marriages lead to fewer divorces), but some is likely a result of people waiting longer to get married in the first place.

Longer life expectancy and lower divorce rates mean that marriage duration has (on average) increased in recent years. 80 percent of marriages last at least 5 years, and 68 percent last 10 years or more, according to data compiled by the U.S. Centers for Disease Control and Prevention based on the National Survey of Family Growth (2006-2010). This is an increase from the 2002 survey, in which 78 percent of marriages lasted at least 5 years and two thirds lasted 10 years or more.

Thursday, January 18, 2018

Where to go for data if the federal government shuts down (2018 edition)

Because we're in the countdown to shutdown again, data users should know that most federal websites will shut down when the government does... 

Here are some other helpful data resources, ranging from national to state and local downloads, to get you through any dark days with no federal data access:
You may also want to try the "Wayback Machine," an online archive of webpages.

For state-specific data... here are links provided by readers and colleagues around the nation:

Check the Clearinghouse of SDCs for a comprehensive listing of Census State Data Centers, or refer to one of these state-based resources:

Please tweet me @DataGeekB if you have recommendations to add to the list!

Special thanks to @mecline6, @censusSDC, @SR_Spatial, @MetroGram, @CarlSchmertmann, and @NDCompass for recommending several links.