Research Articles in Chemistry

I will discuss the text complexity of four chemistry research articles in two ways. First, I will explore the text complexity of one of them in detail, covering the qualitative, quantitative, and reader and task aspects. I also propose my own metric for measuring text complexity in the reader and task aspect. Second, I will use several graphs to compare the text complexity of all four articles.

1. Text Complexity for Clark, J. H. (2006). Green chemistry: today (and tomorrow). Green Chemistry, 8(1), 17-21.

The first article I study is Clark, J. H. (2006). Green chemistry: today (and tomorrow). Green Chemistry, 8(1), 17-21. The following three figures were produced with the tool “StoryToolz”.

Figure 1

From the qualitative aspect, we can study the reading levels in detail.

As shown in Figure 1, the average grade level of the first research article is 10.7. This falls in the 10th to 12th grade band, which is fairly difficult to read. The average is the mean of the Flesch-Kincaid Grade Level, Automated Readability Index, Coleman-Liau, Flesch Reading Ease, Gunning fog index, Laesbarhedsindex (LIX) Formula, and SMOG Index. These tests are similar to one another but were proposed by different researchers. Each is a readability test, used here for the qualitative aspect of text complexity, that estimates how difficult an article is for readers to understand.

Figure 2

From the quantitative aspect, we can study sentence-level information, including the number of characters, words, and sentences, the syllables per word, and so on. This article has about 2,335 words and 11,934 characters, so each word contains approximately 5 characters. Of the 177 sentences, 60% are short sentences of 8 words or fewer, and 27% are long sentences of 23 words or more. With this many long sentences, the longest of which has 58 words, the article is rather hard to read.

Figure 3

Apart from that, we can also look at word usage as a quantitative measure of text complexity. There are 277 prepositions, accounting for 12% of the total words. Pronouns and conjunctions follow, with 4% and 6% of the words respectively. The article also contains quite a large number of nominalizations. Nominalization means using verbs, adjectives, or adverbs as nouns, and reading nouns formed this way requires a high level of reading ability.
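The ratios quoted above can be checked with simple arithmetic. The following Python sketch is only an illustration: it takes the counts reported in the text as given, and the helper name ratio is my own, not part of StoryToolz.

```python
# Counts reported in the text for article 1 (taken from the StoryToolz output).
WORDS = 2335
CHARACTERS = 11934
SENTENCES = 177
PREPOSITIONS = 277


def ratio(part: float, whole: float) -> float:
    """Return part / whole, rejecting an empty or negative denominator."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return part / whole


chars_per_word = ratio(CHARACTERS, WORDS)       # average word length
words_per_sentence = ratio(WORDS, SENTENCES)    # average sentence length
preposition_share = ratio(PREPOSITIONS, WORDS)  # fraction of words that are prepositions

print(f"characters per word: {chars_per_word:.2f}")
print(f"words per sentence: {words_per_sentence:.2f}")
print(f"prepositions: {preposition_share:.1%}")
```

The results are consistent with the rounded figures in the text: about 5 characters per word and about 12% prepositions. The average sentence length of about 13 words is not stated in the text but follows from the same counts.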

From the reader and task aspect, this research article requires readers to have background knowledge in chemistry. It also requires them to be able to understand and explain complex graphics. In addition, the article poses a question to the reader: “Can we build on them as we have done over the last 70 years with the now well-established petro-platform molecules such as ethene and benzene (Fig. 7)?”

Measuring text complexity in the reader and task aspect is not straightforward, so I propose a metric for it, which I call the R&T Index. It takes four factors into consideration: the background knowledge requirement index, the graphs complexity index, the formula complexity index, and the question complexity index. Each factor is graded from 0 to 5, where 0 means the factor does not appear in the article, and all four have equal weight. The final score is their sum divided by 20, the maximum possible total. To be more specific, this article has a background knowledge requirement index of 3, a graphs complexity index of 5, a formula complexity index of 0, and a question complexity index of 3. So the final reader and task score is (3 + 5 + 0 + 3)/20 = 11/20 = 0.55, which is fairly difficult to read.
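The R&T Index calculation described above can be sketched in Python. This is a minimal illustration under my own assumptions: the function and parameter names are hypothetical, and a grade of 0 is accepted because the formula complexity index of this article is given as 0.

```python
MAX_GRADE = 5  # highest grade a single factor can receive


def rt_index(background: int, graphs: int, formulas: int, questions: int) -> float:
    """Return the R&T Index: the sum of the four factor grades divided by 20."""
    factors = (background, graphs, formulas, questions)
    for grade in factors:
        # Assumption: 0 is allowed and means the factor is absent from the article.
        if not 0 <= grade <= MAX_GRADE:
            raise ValueError(f"grade {grade} is outside 0..{MAX_GRADE}")
    # Equal weights: four factors, each worth at most 5, so the maximum total is 20.
    return sum(factors) / (MAX_GRADE * len(factors))


# Grades assigned to article 1 in the text.
article_1 = rt_index(background=3, graphs=5, formulas=0, questions=3)
print(article_1)  # 0.55
```

Dividing by the maximum total keeps the score between 0 and 1, so articles with different factor profiles can be compared on one scale.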

Moreover, I have created the table below, which relates my own metric (R&T Index) to the Flesch reading-ease score from the Flesch–Kincaid readability tests. Comparing the two scales should give a better understanding of the index.

Flesch Score | R&T Index | School level | Notes
100-90 | 0.00-0.20 | 5th Grade | Very easy to read. Easily understood by an average 11-year-old student.
90-80 | 0.20-0.30 | 6th Grade | Easy to read. Conversational English for consumers.
80-70 | 0.30-0.40 | 7th Grade | Fairly easy to read.
70-60 | 0.40-0.50 | 8th & 9th Grade | Plain English. Easily understood by 13- to 15-year-old students.
60-50 | 0.50-0.70 | 10th to 12th Grade | Fairly difficult to read.
50-30 | 0.70-0.80 | College | Difficult to read.
30-0 | 0.80-1.00 | College Graduate | Very difficult to read. Best understood by university graduates.

Table 1
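Looking up a school level in Table 1 can be sketched as a small Python function. This is only an illustration: the names RT_BANDS and school_level are my own, and because neighbouring bands in the table share a boundary value, the sketch assumes that each band includes its upper bound (so a score of exactly 0.40 counts as 7th Grade).

```python
# (upper bound of the R&T Index band, school level, note), copied from Table 1.
RT_BANDS = [
    (0.20, "5th Grade", "Very easy to read"),
    (0.30, "6th Grade", "Easy to read"),
    (0.40, "7th Grade", "Fairly easy to read"),
    (0.50, "8th & 9th Grade", "Plain English"),
    (0.70, "10th to 12th Grade", "Fairly difficult to read"),
    (0.80, "College", "Difficult to read"),
    (1.00, "College Graduate", "Very difficult to read"),
]


def school_level(score: float) -> tuple[str, str]:
    """Return the (school level, note) pair for an R&T Index between 0 and 1."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("the R&T Index must lie between 0 and 1")
    for upper, level, note in RT_BANDS:
        # Assumption: a score equal to a shared boundary falls in the lower band.
        if score <= upper:
            return level, note
    raise AssertionError("unreachable: the last band ends at 1.00")


print(school_level(0.55))  # ('10th to 12th Grade', 'Fairly difficult to read')
```

With this convention, the score of 0.55 obtained for the first article falls in the 10th to 12th Grade band.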

2. Comparison

In this part, I will compare the four research articles in different ways. All four are taken from the journal Green Chemistry:

1. Clark, J. H. (2006). Green chemistry: today (and tomorrow). Green Chemistry, 8(1), 17-21.

2. Jessop, P. G. (2011). Searching for green solvents. Green Chemistry, 13(6), 1391-1398.

3. Polshettiwar, V., & Varma, R. S. (2010). Green chemistry by nano-catalysis. Green Chemistry, 12(5), 743-754.

4. Henderson, R. K., Jiménez-González, C., Constable, D. J., Alston, S. R., Inglis, G. G., Fisher, G., ... & Curzons, A. D. (2011). Expanding GSK's solvent selection guide–embedding sustainability into solvent selection starting at medicinal chemistry. Green Chemistry, 13(4), 854-862.

In the figures below, articles 1 to 4 refer to the four chemistry articles listed above.

Figure 4

For the qualitative measurement, Figure 4 shows that the average grade levels of articles 1, 3, and 4 lie in the 10th to 12th grade band, so they are fairly difficult to read if we consider readability alone. Article 2 is at the 8th grade level, which could be easily understood by 13- to 15-year-old students.

Figure 5

Let’s take a look at the quantitative aspect in Figure 5. The third article has the most short and long sentences, and also 4 questions. Articles 1 and 4 are similar to each other, with a similar number of short sentences, long sentences, and questions. Article 2 lies between these three articles. Overall, we could say that article 3 is the most complex in the quantitative aspect.

Figure 6

For my own metric, the R&T Index, I have calculated the scores of the four research articles: 0.55, 0.7, 0.9, and 0.4 respectively. As Figure 6 shows, the third article has the highest R&T Index, which means it is very difficult to read and is best understood by university graduates. The fourth article scores only 0.4, which is fairly easy to read. The first and second articles score in the range of 0.50-0.70, which is fairly difficult to read.

