Query data

Patrick Hopkins
May 12, 2021

--

Adult section:

I ran the numbers on the Successful Queries spreadsheet. Check it out, but read the warnings below, too. Data without context is just a pile of numbers.

First, my work was possible because of Dr. Carissa Taylor, whose blog is useful for more than just her spreadsheet. Second, these data have a few limitations, none of which are Dr. Taylor’s fault:

  1. The data are skewed heavily toward the pre-2017 publishing world, so if something about queries has changed since 2016, these data likely won’t indicate it. Only 14 of her 119 adult data points with years are from after 2016. Of those 14, only two are from this year.
  2. The goal of a query isn’t to get a book deal. The goal of a query is to get a request for pages. Dr. Taylor acknowledges this on her site. It’s great to have data on queries that got the writer a deal, but this is not a guide to get a book deal. It’s data on queries that helped authors get book deals.
  3. We don’t have data on queries that didn’t work. Do unsuccessful queries look mostly the same? Many questions about those queries cannot be answered because there is no Unsuccessful Queries database. So the analysis I have run is interesting, but there’s nothing to say that the average word count of a successful query isn’t within a word of the average word count of an unsuccessful query.

With that disclaimer out of the way, here’s what I did:

First, I separated the data by age range, since statistics based significantly on word count will differ if one book should be no more than 60,000 words and another book can be north of 100,000.

Second, I selectively removed incomplete data sets. For example, for analyzing query length as related to book length, I eliminated queries in which the book’s length was not mentioned or for which the query length was not indicated.

Third, I performed a light statistical analysis (calculating the average and standard deviation) of book length, query length, book words per query word, paragraph count and paragraph length. I also looked at comp use patterns.
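The second and third steps can be sketched roughly as follows. This is a minimal illustration, not the actual spreadsheet workflow: the rows and the field names (`book_words`, `query_words`) are made up for the example, and since the text does not say whether the figures use the sample or the population standard deviation, the sample version is assumed here.

```python
from statistics import mean, stdev

# Hypothetical rows; None marks a value missing from the spreadsheet.
rows = [
    {"book_words": 90000, "query_words": 180},
    {"book_words": 100000, "query_words": 250},
    {"book_words": None, "query_words": 210},    # book length not given
    {"book_words": 80000, "query_words": None},  # query length not given
    {"book_words": 60000, "query_words": 150},
]


def complete(rows, fields):
    """Keep only the rows in which every listed field is present."""
    return [r for r in rows if all(r[f] is not None for f in fields)]


def summarize(values):
    """Return (average, sample standard deviation) for a list of numbers."""
    return mean(values), stdev(values)


# Drop incomplete rows before computing book words per query word.
usable = complete(rows, ["book_words", "query_words"])
ratios = [r["book_words"] / r["query_words"] for r in usable]

avg_ratio, sd_ratio = summarize(ratios)
print(f"{len(usable)} usable rows")
print(f"book words per query word: {avg_ratio:.0f} +/- {sd_ratio:.0f}")
```

Filtering per analysis, rather than once up front, keeps a row that lacks a book length available for the paragraph statistics.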

The data for adult books:

The average adult book length for the 131 relevant data points is 91,926 words, with a standard deviation of 19,437 words (rounding to nearest word throughout, since partial words do not exist).

The average adult query is 195 words, with a standard deviation of 62 words.

On average, an adult book has 526 book words per query word, with a standard deviation of 213 book words.

The average adult query has 2.67 paragraphs, with a standard deviation of 1.1 paragraphs.

The average number of words per paragraph is 80, with a standard deviation of 28.

Now, comps. I first looked at comp use by year and found it wavering between one-third and two-thirds of queries. But since no year has 30 data points (a common rule-of-thumb minimum for a reliable average), that was only so useful.

Comp use by book length was interesting. Among books whose queries used comps, the average length was 89,808 words.

Compless books were at 95,849 words.

The pattern held true for query length, with comp queries averaging 178 words and compless queries averaging 208. Queries with comps thus tend to be more concise, at 506 book words per query word to compless queries’ 462. (Given a 50,000-word book, a 100-word query is 500 book words per query word; a 200-word query for that book is 250 book words per query word. More book words per query word equals fewer query words, meaning shorter queries.)
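The ratio in the parenthetical is a single division. A small sketch, with a function name chosen only for this illustration:

```python
def book_words_per_query_word(book_words: int, query_words: int) -> float:
    """Book length divided by query length; higher means a shorter query."""
    if query_words <= 0:
        raise ValueError("query_words must be positive")
    return book_words / query_words


# The two cases from the text: one 50,000-word book, two query lengths.
print(book_words_per_query_word(50_000, 100))  # 500.0
print(book_words_per_query_word(50_000, 200))  # 250.0
```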

Given the relatively few data points per year, I don’t think an examination of book length change would be useful. Book length did dive in 2012, but that dive is based on six data points. Similarly, queries don’t have enough data for a useful look at how they’ve changed. They may have shrunk in 2016 and then swelled to normal size in 2017, but that’s based on insufficient data.
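One rough way to see why a six-book average is shaky is the standard error of the mean, which is the standard deviation divided by the square root of the sample size. The sketch below assumes each year's spread resembles the overall adult standard deviation of 19,437 words. That is an assumption made for illustration; the per-year spreads are not reported here.

```python
from math import sqrt

# Overall adult book-length standard deviation, from the figures above.
ADULT_BOOK_SD = 19_437


def standard_error(std_dev: float, n: int) -> float:
    """Standard error of the mean for a sample of size n."""
    if n <= 0:
        raise ValueError("n must be positive")
    return std_dev / sqrt(n)


# Compare a six-book year, a 30-book year and the full adult set.
for n in (6, 30, 131):
    print(f"n={n:>3}: average uncertain by about {standard_error(ADULT_BOOK_SD, n):,.0f} words")
```

Under that assumption, a yearly average built on six books is uncertain by about 7,900 words, against about 3,500 words with 30 books, so a one-year dive of a few thousand words is hard to tell apart from noise.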

Conclusions:

1) Comps are a nice thing, but these data don’t suggest they’re necessary.

2) A five-paragraph query of 300 words is a bad idea.

3) Professor! We need more data!
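To put conclusion 2 in numbers, here is a small sketch that measures how far a 300-word, five-paragraph query sits from the adult averages, in standard deviations. The helper name is made up for this example, and it says nothing about whether the data follow a bell curve.

```python
def z_score(value: float, mean: float, sd: float) -> float:
    """Distance of a value from the average, in standard deviations."""
    return (value - mean) / sd


# (average, standard deviation) pairs from the adult figures above.
ADULT_QUERY_WORDS = (195, 62)
ADULT_PARAGRAPHS = (2.67, 1.1)

words_z = z_score(300, *ADULT_QUERY_WORDS)
paragraphs_z = z_score(5, *ADULT_PARAGRAPHS)

print(f"300 words: {words_z:.2f} standard deviations above average")
print(f"5 paragraphs: {paragraphs_z:.2f} standard deviations above average")
```

Both come out well above average: roughly 1.7 standard deviations for the word count and 2.1 for the paragraph count.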

— —

Young adult section:

1) Average book length

The average book length for the 253 relevant data points is 76,007 words, with a standard deviation of 14,871 words. But not all book lengths are created equal. Books that got agents in 2010 are shorter than books that got agents in 2016. Here’s a chart of average book length by year — but these averages should be looked at with suspicion because five of them include fewer than 30 data points, which is the minimum for statistical reliability, as far as I know (bolded years and averages indicate at least 30 data points):

2017 82,286

2016 83,991

2015 80,696

2014 76,552

2013 71,394

2012 74,069

2011 76,667

2010 75,535

2009 63,200

Why was 2013 so low? Did something happen to cause such a sharp and steady increase in 2014 to 2016? As I said, it’s hard to assume some of these numbers are closely representative of the years in question, but they do suggest a trend.

2) Average query length

The average query length is 200 words, and the standard deviation is 46 words. On average, a book has 391 words per query word, and the standard deviation is 108 book words.

The self-correction in the data is fascinating. Look at how the number of book words per query word (higher number=shorter query) fluctuates:

2017 368.5965615

2016 403.2156285

2015 398.8612061

2014 378.5899116

2013 394.7375107

2012 369.6337888

2011 394.0961558

2010 406.6422939

2009 382.1260732

I have again bolded the years with at least 30 data points. But look at that self-correction. Reading from 2009 forward, the ratio goes up, then down to 394ish, then down, then up … to 394. Then down, then up again to around 398, then up a touch more, then … plummets. It’s like people subconsciously “realized” they were writing distinctly short queries in 2016 and lengthened them.

Another interesting thing is the mental separation of the query from the book. If the tasks were more linked in the writer’s mind, we might expect queries to get longer as books got longer, but between 2013 and 2016, while books are steadily gaining about 12,600 words, the query is staying relatively flat.

3) Paragraphs

The average query has 2.9 paragraphs, and the standard deviation is .92 paragraphs. The average number of words per paragraph is 74, and the standard deviation is 25 words.

4) Comps

A) 51 of 281 entries did not indicate comp use either way.

82 of 281 queries did not use comps.

143 used comps.

B) Running a yearly analysis of comps showed them in at least three-quarters of the queries with comp data from 2015 on. The chart below shows the year and how many queries contained no comps/no data/comps:

2017: 3/4/9

2016: 3/3/20

2015: 6/3/18

2014: 11/7/13

2013: 17/6/18

2012: 8/12/15

2011: 15/5/16

2010: 9/5/16

What seems pretty clear to me is that something happened in late 2013/early 2014 — when people started finishing longer books — and has persisted.

C) Finally, I looked at book length versus comp use. The average comp-using book is 75,920 words long; the average compless book is 75,464 words long. I don’t see statistical significance in that difference of 456 words; 456/75,464 is about .006, or 0.6 percent.
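The yearly counts in B) are easier to compare as shares. The sketch below copies the no comps/no data/comps counts from the chart and computes the share of queries that used comps. It leaves the no-data queries out of the denominator, which is a choice made for this illustration and not necessarily how the yearly analysis was run.

```python
# year: (no comps, no data, comps), copied from the yearly chart in B)
COMP_COUNTS = {
    2017: (3, 4, 9),
    2016: (3, 3, 20),
    2015: (6, 3, 18),
    2014: (11, 7, 13),
    2013: (17, 6, 18),
    2012: (8, 12, 15),
    2011: (15, 5, 16),
    2010: (9, 5, 16),
}


def comp_share(no_comps: int, no_data: int, comps: int) -> float:
    """Share of queries with known comp status that used comps."""
    known = no_comps + comps  # no_data is deliberately left out
    return comps / known


for year in sorted(COMP_COUNTS):
    print(year, f"{comp_share(*COMP_COUNTS[year]):.0%}")
```

Counting the no-data queries as compless instead would lower every share, so the choice of denominator matters with counts this small.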

Conclusions:

1) Young adult books are longer than they were in 2013.

2) Books like to be about 394 book words per query word. At that ratio, an average 82,000-word book is going to get you a query of 208 words.

3) If your young adult query is 450 words and 10 paragraphs and doesn’t use comps, rethink it. The longest successful young adult query used 394 words, the most paragraphs was 7 (2–4 is average), and comps are expected.
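Conclusions 2 and 3 can be turned into a quick self-check. This is a sketch only: the function names are made up, the limits are the figures quoted above (394 book words per query word, a longest successful query of 394 words, a maximum of 7 paragraphs), and passing the check says nothing about whether a query will work.

```python
YA_RATIO = 394            # book words per query word, from conclusion 2
YA_MAX_QUERY_WORDS = 394  # longest successful young adult query
YA_MAX_PARAGRAPHS = 7     # most paragraphs in a successful query


def target_query_words(book_words: int, ratio: float = YA_RATIO) -> int:
    """Query length implied by a book length at the given ratio."""
    return round(book_words / ratio)


def check_ya_query(query_words: int, paragraphs: int, uses_comps: bool) -> list:
    """Return a list of warnings; an empty list means nothing was flagged."""
    warnings = []
    if query_words > YA_MAX_QUERY_WORDS:
        warnings.append("longer than any successful query in the data")
    if paragraphs > YA_MAX_PARAGRAPHS:
        warnings.append("more paragraphs than any successful query in the data")
    if not uses_comps:
        warnings.append("no comps")
    return warnings


print(target_query_words(82_000))      # 208
print(check_ya_query(450, 10, False))  # all three warnings
```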

— —

Middle grade section:

The average book length for the 53 relevant data points is 45,981 words, with a standard deviation of 13,204 words.

The average query length is 184 words, and the standard deviation is 43 words.

On average, a book has 235 book words per query word, and the standard deviation is 76 book words.

The average query has 2.55 paragraphs, and the standard deviation is .95 paragraphs.

The average number of words per paragraph is 78, and the standard deviation is 24 words.

Running a yearly analysis of comps showed nearly universal use from 2014 on. Previous years also were robust, with a low of 3/7 books using comps. Owing to the significant comp use, I did not analyze use against query length.

So if your middle grade query is 300 words and six paragraphs and doesn’t use comps, rethink it. The longest successful middle grade query used 277 words.
