Online College Math Teacher

Saturday, June 21, 2025

It's a linear world - approximately; and the "Rule of 72"

You are probably familiar with the "Rule of 72" in investing: if an investment compounds at annual interest rate i, then the number of years for money to double is approximately 72 / i. For example, if an investment compounds at annual interest rate of 9%, then the investment will double in approximately 72/9 = 8 years. The compound interest formula confirms this: 1 * 1.09 ⁸ = 1.992 (rounded), which is approximately 2.

For fun, ask your banker or investment advisor why the Rule of 72 works.

It works because in the short run, it's a linear world - approximately. A little more mathematical is that, in a small enough neighborhood, any differentiable function is approximately linear.

For the Rule of 72 we are solving for n in the equation 1 * (1 + i/100) ⁿ = 2. We take natural logarithms of both sides:

1 * (1 + i/100) ⁿ = 2
LN (1 + i/100) ⁿ = LN 2
n * LN (1 + i/100) = LN 2
n = LN 2 / LN (1 + i/100)
n ≈ .693 / LN (1 + i/100)
n ≈ .693 / (i/100)
n ≈ 69.3 / i
n ≈ 72 / i

The key step three lines above is that for small values of x, LN (1 + x) is approximately equal to x. One way to see this is that the Taylor series (sorry, not invented by Taylor Swift) expansion around a = 0 is:

LN (1+x) = x - x ² / 2 + x ³ / 3 - x ⁴ / 4 + ... .

For small x, LN (1+x) is approximately equal to x.

In the following plots, the plot on the left shows that the logarithmic function 1 of y = LN (1 + x) is surely different than the linear function 2 of y = LN (1.05) + (1/1.05)*(x - .05). Function 2 is the equation of the tangent line to function 1 at x = .05. (To derive the tangent line, recall from calculus that if y = LN(1 + x), then dy/dx = 1/(1 + x) ). The plot on the right is of the same two functions, but with the x range shrunk to (0.0, 0.1). In this small range the two functions are indistinguishable. In this small neighborhood, the differentiable function y = LN(1 + X) is indistinguishable from its linear tangent.

Of course a "Rule of 69.3", where LN 2 is .693 to three decimal places, would be a better approximation to the compound interest formula than the "Rule of 72", but 72 is close enough and is useful because 72 has so many integer divisors.

In the real world we often assume linearity holds within a small range. For example in cooking a turkey (not that I have ever done this), one website says to allow about 15 minutes per pound (on average!) at 350°F to cook a stuffed turkey. But their time estimates by pound are not linear, so for example their time estimate to cook a 24-pounder is less than twice as long to cook a 12-pounder. Perhaps a better approximation is a power curve y = a * (x ^b) . Presumably there are some physics considerations that are non-linear.

The Newton-Raphson method, x _n+1 = x _n - f (x _n ) / f ^' (x _n ), for finding approximations to roots of a function, is an example of a mathematical use of linearity. N-R iteratively finds the x-intercept of the tangent of the graph of f. A number of R packages have N-R functions.

And before we end the subject of "it's a linear world - approximately", you can certainly measure the straight line distance between two cities, but if you travel by airplane, the airplane is flying the distance between two points on a sphere. Given the latitude and longitude coordinates and the central angle between the two points, the spherical distance can be calculated by the Haversine Formula.

R makes it easy not to settle for linearity. But for those of us who are not doing something where great precision is required, linearity may be just fine,

Here is the R-code to plot the graphs. Note that in R, log(x) is the natural log function. This is also true in Python and Excel.



library(ggplot2)

common_theme <- theme(
     legend.position="right",
     plot.title = element_text(size=15, face="bold"),
     plot.subtitle = element_text(size=12.5, face="bold"),
     axis.title = element_text(size=15, face="bold"),
     axis.text = element_text(size=15, face="bold"),
     axis.title.y = element_text(angle = 0),
     legend.title = element_text(size=15, face="bold"),
     legend.text = element_text(size=15, face="bold"))

a <- 0.0
b <- 1.0
x_values <- seq(a, b, by = 0.01)
func1 <- function(x) {log(1 + x)}
func2 <- function(x) {log(1.05) + (1/1.05)*(x - .05)}
df1 <- data.frame(x = x_values,
           y = log(1 +x_values),
           func = "Function 1")
df2 <- data.frame(x = x_values,
           y = log(1.05) + (1/1.05)*(x_values - .05),
           func = "Function 2")
combined_df <- rbind(df1, df2)

ggplot(combined_df, aes(x = x, y = y, color = func)) +
   geom_line(size = 1.5) +
   scale_color_manual(values = c("Function 1" = "#E41A1C", 
            "Function 2" = "royalblue4")) +
   ylim(c(0, 1)) + 
   labs(title = "Plot of LN and Linear Functions",
              x = "x",
              y = "y",
              color = "Functions") +
  common_theme

ggplot(combined_df, aes(x = x, y = y, color = func)) +
   geom_line(size = 1.5) +
   scale_color_manual(values = c("Function 1" = "#E41A1C", 
            "Function 2" = "royalblue4")) +
   xlim(c(0, 0.1)) +
   ylim(c(0, 0.11)) + 
   labs(title = "Plot of LN Function and its Tangent Line at x = .05",
              x = "x",
              y = "y",
              color = "Functions") +
  common_theme

End

Sunday, June 15, 2025

Hebrew Gematria in R

Gematria is a Greek word for the practice of assigning numerical values to letters. In Hebrew it has been used to interpret Jewish texts, particularly the Bible, by attempting to discover hidden meanings or connections between words and concepts.

One example of gematria is in Genesis 14:14 when Abram takes 318 men to rescue his relative. The gematria values of Abram's servant Eliezer and of the Hebrew word for speaking are both 318. Perhaps there is more to this passage than just the 318 men. See https://stljewishlight.org/arts-entertainment/understanding-hebrew-numerology-and-the-secrets-of-the-torah/#:~:text=One%20famous%20example%20of%20gematria,connects%20to%20the%20heavenly%20universe .

The table of 22 Hebrew letters and their gematria values is the following. I have omitted the five Hebrew letters that are formed differently when they appear as the last letter of a word, but their values are the same as the corresponding non last letter. Gematria does not use the dots and dashes vowel symbols that appear below the Hebrew letter.

Although gematria is not universally accepted by Jews and Jewish scholars, it is common in Jewish culture to make donations in multiples of $18: 18 is the gematria value of the Hebrew word "chai" (חי) which means "life". ח (chet equals 8) + י (yod equals 10).

There is an R package gemmatria (two MM's), but it is not currently available on CRAN due to encoding issues about non-ASCII characters. It is installable from GitHub using the devtools package with the following command: devtools::install_github("benyamindsmith/gemmatria", force=TRUE) . As an example of the package, the R code for the gematria value of yaakov, יעקב , is

library(gemmatria)
get_gemmatria("יעקב")

which gives 182. This is calculated as the sum (yod = 10) + (ayin = 70) + (qof = 100) + (bet = 2).

One online source that will accept a Hebrew word and calculate its gematria value is https://www.torahcalc.com/tools/gematria .

Although the above website and gemmatria package are sufficient for most uses, I thought it would be instructive to build my own gematria calculator in R.

I began by creating a dataframe of the above table. I found the right-to-left nature of entering 27 Hebrew letters, surrounded by quotation marks and separated by commas, too cumbersome. Instead I used their Unicode hex equivalents, such as א is u05D0 ; print("\u05D0") will print alef, א.

Then given a Hebrew word, I split the word into its individual characters, and I used the match command to look up and sum the gematria values. Note that the Hebrew word must not contain vowels, or else there will be no match.

I also added a little exploration of five Biblical descendants of Abraham and the number 26. 26 has a lot of significance in the Bible.

26 is the gematria value of the four letter unspeakable Hebrew name for G-d: Yod + He + Vav + He = 10 + 5 + 6 + 5 = 26.
26 is the number of generations from Adam to Moses.
26 is the number of generations from David to Jesus according to one geneaology (but interpretations differ).
There are many examples of Biblical gematria values of 26.

26 also has some interesting mathematical properties such as it is the sum of a cube's 6 faces, 12 edges, and 8 vertices, but I will leave such mathematical properties for another time.

For five of the Biblical descendants of Abraham:

Gematria of Isaac יצחק is 208 is multiple of 26.
Gematria of Jacob יעקב is 182 is multiple of 26.
Gematria of Joseph יוסף is 156 is multiple of 26.
Gematria of Ishmael ישמעאל is 451 is not multiple of 26.
Gematria of Esau עשו is 376 is not multiple of 26.

I will leave an interpretation of the above to the reader.

There are many other examples of the use of gematria to help explain the Bible. Also, I have limited this to Standard Gematria; there are other forms.

Is gematria clever wordplay and coincidence? Or were the biblical writers trying to tell us something?

The R code is as follows:


# Define Hebrew letters, names, and gematria numbers
df <- data.frame(
  hebrew = c("\u05D0", "\u05D1", "\u05D2", "\u05D3", "\u05D4",
             "\u05D5", "\u05D6", "\u05D7", "\u05D8", "\u05D9",
             "\u05DA", "\u05DB", "\u05DC", "\u05DD", "\u05DE",
             "\u05DF", "\u05E0", "\u05E1", "\u05E2", "\u05E3",
             "\u05E4", "\u05E5", "\u05E6", "\u05E7", "\u05E8",
             "\u05E9", "\u05EA"),
  
  name = c("Alef", "Bet", "Gimel", "Dalet", "He", "Vav", "Zayin", 
           "Chet", "Tet", "Yod", "Final_Kaf", "Kaf", "Lamed", "Final_Mem", 
           "Mem", "Final_Nun", "Nun", "Samech", "Ayin", "Final_Pe", "Pe", 
           "Final_Tsadi", "Tsadi", "Qof", "Resh", "Shin", "Tav"),
  
  number = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 20, 30, 40, 40, 
             50, 50, 60, 70, 80, 80, 90, 90, 100, 200, 300, 400)
)

# Define words;  no vowels!
yitzchak <- "יצחק"
yaakov <- "יעקב"
yosef <- "יוסף"
ishmael <- "ישמעאל"
esau <- "עשו"


words <- c(yitzchak, yaakov, yosef, ishmael, esau)

gematria <- function(word) {
   # Split word into characters
   v <- strsplit(word, split = "")[[1]]
   # Calculate Gematria of word using vectorized lookup
   gematria_value <- sum(df$number[match(v, df$hebrew)])
   if (gematria_value %% 26 == 0) divisibility = "is multiple of 26" 
     else divisibility = "is not multiple of 26"
  # Output result
  cat("Gematria of", word, "is", gematria_value, divisibility, "\n")
}

for (word in words) {
  gematria(word)
}

End

Monday, June 2, 2025

LGBTQ+ Pride Month

This is the start of Pride month. I have friends, acquaintances, and family members in most if not all LGBTQ+ categories, and probably you do too although you may not know it. I wish they all could live lives of being who they want to be without being socially or legally hassled. I made this flag in computer language R.

Here is the R code:

 # Draw empty plot
plot(NULL, col = "white", xlim = c(0, 12), ylim = c(0, 8), xlab="", xaxt="n", ylab="", yaxt="n", 
    main="LGBTQ+ Rainbow Flag" ) 

polygon(x = c(3.33, 12, 12, 2), y = c(6.67, 6.67, 8, 8), col = "#e40303")  # coordinates counter-
    clockwise; red                  
polygon(x = c(4.67, 12, 12, 3.33), y = c(5.33, 5.33, 6.67, 6.67), col = "#ff8c00") # orange
polygon(x = c(6, 12, 12, 4.67), y = c(4, 4, 5.33, 5.33), col = "#ffed00") # yellow
polygon(x = c(4.67, 12, 12, 6), y = c(2.67, 2.67, 4, 4), col = "#008026") # green
polygon(x = c(3.33, 12, 12, 4.67), y = c(1.33, 1.33, 2.67, 2.67), col = "#004dff") # blue
polygon(x = c(2, 12, 12, 3.33), y = c(0, 0, 1.33, 1.33), col = "#750787") # purple

polygon(x = c(1, 2, 6, 2, 1, 5), y = c(0, 0, 4, 8, 8,4), col = "#000000") # black
polygon(x = c(0, 1, 5, 1, 0, 4), y = c(0, 0, 4, 8, 8, 4), col = "#613915") # brown
polygon(x = c(0, 0, 4, 0, 0, 3), y = c(1.33, 0, 4, 8, 6.67, 4), col = "#74d7ee") # light blue
polygon(x = c(0, 0, 3, 0, 0, 2), y = c(2.67, 1.33, 4, 6.67, 5.33, 4), col = "#ffafc8") # pink
polygon(x = c(0, 0, 2), y = c(5.33, 2.67, 4), col = "#ffffff") # white

End

Monday, May 26, 2025

Self-intersecting Quadrilateral

A quadrilateral is a polygon having four sides, four angles, and four vertices. A polygon means that the figure is a closed shape, meaning the last line segment connects back to the first one, effectively enclosing an area.

We usually think of quadrilaterals as squares, rectangles, parallelograms, trapezoids, rhombuses, or kites. (I was impressed that my four year-old granddaughter knew the last one, although she called it a diamond!) It could also be irregularly shaped with no name.

However, a polygon may intersect itself. A five-sided star is one example, where the sides are connected to alternating vertices.

A quarilateral may also intersect itself. In the following diagram, the original quadrilateral has points A (0,0), B (4,0), C (3,3), D (1,4). The self-intersecting quadrilateral is formed from the original quadrilateral by shifting point D from (1,4) to (2, -2), so side CD crosses AB).

This self-intersecting quadrilateral is still four-sided and closed, so it is no less a quadrilateral than the original.

Here is some R code:


# Self-intersecting quadrilateral
library(ggplot2)

# Define coordinates for original quadrilateral
original_quad <- data.frame(
  x1 = c(0, 4, 3, 1),
  y1 = c(0, 0, 3, 4),
  x2 = c(4, 3, 1, 0),
  y2 = c(0, 3, 4, 0),
  group = c("AB", "BC", "CD", "DA"),
  labels = c("A", "B", "C", "D")
)

# Define coordinates for self-intersecting quadrilateral
# Shift point D from (1,4) to (2, -2), so side CD crosses AB)

stretched_quad <- data.frame(
  x1 = c(0, 4, 3, 2),  
  y1 = c(0, 0, 3, -2), 
  x2 = c(4, 3, 2, 0),
  y2 = c(0, 3, -2, 0),
  group = c("AB", "BC", "CD", "DA"),
  labels = c("A", "B", "C", "D")
)

# Define colors for each side
color_map <- c("AB" = "red", "BC" = "blue", "CD" = "green", "DA" = "purple")

# Function to plot the quadrilateral
plot_quad <- function(data, title, x_lim, y_lim) {
  ggplot(data) +
    geom_segment(aes(x = x1, y = y1, xend = x2, yend = y2, color = group), size = 1.5) +  # Draw each side
    geom_point(aes(x = x1, y = y1), size = 3, color = "black") +  # Show points
    geom_text(aes(x = x1, y = y1, label = labels), vjust = -1, hjust = -0.5, size = 6, fontface = "bold") +  # Label A, B, C, D
    scale_color_manual(values = color_map) +
    coord_fixed() +
    xlim(x_lim[1], x_lim[2]) +
    ylim(y_lim[1], y_lim[2]) +
    theme_minimal() +
    ggtitle(title)
}

# Expanded limits for full visibility
x_range <- c(-1, 7)  
y_range <- c(-3, 7)  

# Plot the original quadrilateral
p1 <- plot_quad(original_quad, "Original Quadrilateral", x_range, y_range)

# Plot the self-intersecting quadrilateral
p2 <- plot_quad(stretched_quad, "Self-Intersecting Quadrilateral", x_range, y_range)

# Display both plots
library(gridExtra)
grid.arrange(p1, p2, ncol = 2)

End

Monday, March 31, 2025

April Fool's Day, and the Pythagorean Theorem

Here is a post in honor of April Fools' Day, which unlike March 14, is celebrated by countries that use either the month-day format or the day-month format.

Not everyone is a math person, but nearly everyone remembers the Pythagorean Theorem.

Well, not everyone. The Scarecrow got it wrong in the 1939 movie, "The Wizard of Oz". It is arguable whether the writers accidentally or deliberately got it wrong, or whether the actor Ray Bolger flubbed the line. (See also Singh, "The Simpsons and Their Mathematical Secrets", pp. 119-121.)

So here is a little Pythagorean Theorem problem. In the following right triangle, calculate C. A² + B² = C², and then take the square root of C. How hard can it be?

Of course you have to know if A = (√-1), then A² = (√-1)² = -1, but that seems reasonable.

But does something seem wrong with the resulting value of C?

I plotted the triangle in a complex number plane, not a real number plane. (√-1) is not a "real" number, and perhaps it should be thought of as a non-real number. (I am intentionally avoiding the dreaded "i" word.) Perhaps what is happening is that in the complex plane, the hypotenuse C is really a line segment of magnitude zero, and hence a point? No, not "really".

So April Fool's. You calculated a hypotenuse of a right triangle and concluded it has length zero.

A better answer is that a point (x, y) in the complex plane is represented as x + yi, and the Euclidean distance between two complex points (x, y) and (u, v) equals √( (u - x)² + (v - y)² ) which here is the expected √2.

Here is the R code for the plot, followed by the Python code for the plot. Note that in R, the shading is done with geom_polygon, while in Python it is done with plt.fill_between:


library(ggplot2)

# Define the vertices of the triangle
triangle_data <- data.frame(
    x = c(0, 0, 1),    # x-coordinates of vertices
    y = c(0, 1, 0)     # y-coordinates of vertices
)

ggplot(data = NULL) +
    geom_polygon(data = triangle_data, aes(x = x, y = y), fill = "#9B111E", color = "black") +
    geom_segment(aes(x = 0, y = 0, xend = 1, yend = 0), color = "black", size = 2) +
    geom_segment(aes(x = 0, y = 0, xend = 0, yend = 1), color = "black", size = 2) +
    geom_segment(aes(x = 0, y = 1, xend = 1, yend = 0), color = "black", size = 2) +
    ggtitle("Use the Pythagorean Theorem to calculate C") +
    xlab(expression(bold(A == sqrt(-1)))) +  # Square root symbol for x-axis label
    ylab("B = 1") +     # Label for y-axis
    geom_text(aes(x = 0.55, y = 0.55, label = "C = ?"), fontface = "bold", size = 6) +
    theme_classic() +
    theme(
        legend.position = "none",
        axis.line = element_blank(),
        axis.text = element_blank(),
        axis.ticks = element_blank(),
        axis.title.x = element_text(size = 20, face = "bold"),
        axis.title.y = element_text(size = 20, face = "bold"),  
        plot.title = element_text(size = 20, face = "bold")
    )



import numpy as np
import matplotlib.pyplot as plt

# Create figure and axes
fig, ax = plt.subplots()
ax.set_title('Use the Pythagorean Theorem to calculate C', fontsize=15, fontweight='bold')

# Define the points for the line
x = [0, 1]
y = [1, 0]

# Plot the horizontal and vertical lines
ax.axhline(y=0, color='black')  # x-axis
ax.axvline(x=0, color='black')  # y-axis

# Plot the diagonal line
ax.plot(x, y, color='black', linewidth=1.5)

# Fill the region enclosed by the axes and the line
plt.fill_between(x, y, 0, color='#9B111E', alpha=0.5)

# Add axis labels with bold and enlarged fonts
ax.text(0.5, -0.05, r"$\mathbf{A = \sqrt{-1}}$", ha='center', va='center', fontsize=14, transform=ax.transAxes)
ax.text(-0.05, 0.5, r"$\mathbf{B = 1}$", ha='center', va='center', fontsize=14, rotation=90, transform=ax.transAxes)

# Add diagonal line label with bold text, enlarged font, and moved farther off the line
ax.text(.6, .5, r"$\mathbf{C}$", ha='center', va='center', fontsize=14, color='black')

# Turn off the axes' tick marks and spines
ax.spines['top'].set_color('none')
ax.spines['right'].set_color('none')
ax.spines['bottom'].set_color('none')
ax.spines['left'].set_color('none')
ax.tick_params(left=False, bottom=False, labelleft=False, labelbottom=False)

# Display the plot
plt.show()

# end

Friday, March 14, 2025

Pi and billiard balls; a different application of π for Pi Day

So suppose there is some sort of physical experiment, and the first time you do it, the answer comes out as 3. Then you change the experiment to make it a little more complicated, and the answer comes out as 31. Then you make it even more complicated, and the answer comes out 314. Then you ... . Of course you see where I'm going with this.

It's March 14, or 3 14 (in the US date format), and because 314 are the first digits of π, many people use this day to share interesting and unusual appearances of π. I think you will find Pi and the Billiard Balls to be a little different!

But first, if your statistics course did not require calculus as a prerequisite, you may be unaware that π is contained within the formula for the probability density of the Normal distribution, f(x) =

\frac{1}{\sqrt{2 π σ^{2}}}^{\exp - \frac{{(x - μ)}^{2}}{2 σ^{2}}}

where the 2π is necessary for the integral of the pdf to equal 1.

A common application of π on Pi Day is the Buffon's needle problem: Given a needle of length l dropped on a plane ruled with parallel lines t units apart, what is the probability that the needle will lie across a line upon landing? There are many references to this such as Buffon, so the solution will not be repeated here.

π appears in many places in math. One reason is that π is defined in reference to a circle. The trigonometric functions can be defined in terms of triangles within a circle. As a point traverses the circumference of the circle more than once, the trigonometric functions repeat in cycles of 2𝜋. Therefore, phenomena that repeat and can be represented by trigonometric functions are likely to have π somewhere within them. This is because π inherently relates to the periodic nature of these functions, making it indispensable in modeling cyclical behaviors.

I recently discovered a physics problem called Pi and Billiard Balls. The original article is by G. Galperin, and I will try to summarize it. His paper is not an easy read.

Suppose we have the first quadrant of an xy coordinate system. Suppose we have a vertical wall at x = 0, y ⪰ 0. We have ball 1 having mass m1 at initial position x1 > 0, and ball 2 having mass m2 ⪰ m1 at initial position x2 > x1. Let the ratio of the masses be a multiple of 100: m2 / m1 = 100^N for a fixed non-negative integer N, including N = 0. Suppose ball 2 moves from right to left along the x-axis and collides with ball 1, ball 1 moves from right to left and collides with the vertical wall, and assume all collisions will be perfectly elastic. This perfect elasticity assumption implies the balls will satisfy the law of conservation of momentum, and the law of conservation of kinetic energy. We also assume the balls only move along the x-axis; there is no y movement.

The amazing conclusion of all this is: The total number of collisions C(N) is a number equal to the first N decimal digits of the number π (starting with 3) !

For case 1, let N = 0 so m2 = m1. Now push ball 2 from right to left at initial velocity v2 until it hits ball 1. This is collision 1. Ball 1 will move from right to left at the same velocity v1' = v2, while ball 2 will now be at rest. Eventually ball 1 hits the wall for collision 2 and now moves from left to right at velocity -v1', until it hits ball 2 for collision 3. Now ball 1 is at rest, and ball 2 is moving to the right with velocity −v1'. There have been 3 collisions (the first digit of π), and there will be no more.

For case 2, let N = 1 so m2 / m1 = 100. This is more complicated because after collision 1, v1' will not equal v2, and ball 2 will not be at rest. The result of the two conservation equations m1v1 + m2v2 = m1v1' + m2v2' and .5m1(v1^2) + .5m2(v2^2) = .5m1(v1' ^2) + .5m2(v2' ^2), give v1' = [(m1 - m2)v1 + 2m2v2] / (m1 + m2) and v2' = [(m2 - m1)v2 + 2m1v1] / (m1 + m2). After collision 1, substituting v1 = 0 and m2 / m1 = 100 gives v1' = 200v2/101 ≈ 1.98v2 (nearly twice as fast as initial velocity v2) and v2' = 99v2/101 ≈ .98v2 (slightly less than initial velocity v2).

Ball 1 hits the wall for collision 2 and now moves from left to right at velocity -v1', until it hits ball 2 for collision 3.

After collision 3, substituting primes into the two conservation equations, gives v1''' ≈ -0.96v2 and v2''' ≈ 1.96v2. Ball 1 will bounce back and forth between ball 2 and the wall many times. After each collision between the two balls, the velocity of each ball changes with ball 1 decreasing in velocity and ball 2 increasing in velocity. The relative velocity between ball 1 and ball 2 decreases with each collision, as the heavier ball will slowly transfer its momentum to the lighter ball. Eventually the relative velocity will be so small that they will effectively move together after the collision. At this point, no more collisions will occur. There will be 31 collisions (the first two digits of π).

For case 3, let N = 2 and m2 / m1 = 100², there are 314 collisions. And so on.

Not surprisingly there is a circle lurking under all of this, due to the conservation of kinetic energy equation .5m1(v1^2) + .5m2(v2^2) = constant, and there is a trigonometric function and a calculation involving an angle and π. Galperin proves the conclusion in general: The total number of collisions C(N) is a number equal to the first N decimal digits of the number π (starting with 3) !

In addition to the Galperin paper, a good explanation is here, but this is not an easy read.

You are welcome to try this experiment yourself, if you can create the condition of perfect elasticity. Happy Pi Day.

Thursday, January 30, 2025

The Mathematics of Taylor Swift

I confess to being fascinated by Taylor Swift for far more than her music. I think she is an extraordinary person for her philanthropy, her speaking out for victims of sexual assault, her advocacy of artists' ownership rights, and her urging her fans to vote.

But of course there's much more, and so let's look at the mathematics of Taylor Swift.

Let's start with her net worth. She is estimared to have $1.6 billion in assets. Her largest asset is her music catalog. She did not own the masters (first recording) to her older music, but she re-recorded them and owns the masters to the re-recorded versions and to her future recordings.

She donates large sums to a variety of organizations. Most of these donations are not public, but here is a sample of some recent donations:

      There are three parts to this mathematical analysis:
         1. her attractiveness,
         2. her lipstick shades, and
         3. her lyrics.
Each part contains a considerable amount of imprecision in measuremenys and judgments. This is more of a work in progress than the final word. Nevertheless, I think it is a non-routine use of math, and it was a lot of fun.

1. How pretty is she?

You can have your opinion. But mine is based on math.

The ancient Greeks discovered a particular number called the Golden Ratio, denoted by Greek letter Φ (phi), that has many interesting mathematical properties, apppears in some patterns of nature, and is considered by many to be asthetically pleasing. The Golden Ratio results from finding the point on a line segment that splits the segement into two smaller segments with lengths a and b, such that (a + b)/a = a/b.

That ratio a/b is the Golden Ratio, Φ. With a little algebra, Φ = (1 + √5)/2 , which is an irrational number so it has an infinite non-repeating decimal, and rounded to three decimal places is 1.618.

Renaissance artists, plastic surgeons, and makeup artists are among those who use Golden Ratios in various ways with faces to create ideally proportioned faces. Gary Meisner has wriiren extensively on the Golden Ratio, and he believes there are over 20 different ways that the Golden Ratio shows up in human faces and that “the Golden Ratio is also found very commonly in beautiful models of today across all ethnic groups". Biostatistican professor Dr. Kendra Schmid and her colleagues performed various measures of many faces. They began with 17 potential Golden Ratios, and they decided only six of these ratios were predictors of facial attractiveness. See Schmid.

This takes us to Taylor. I attempted to measure these six ratios on a picture of Taylor. There are many pictures of her, she does enjoy experimenting with different hairstyles, and I had to find one with a hairstyle that gave me the best chance of measuring her from her hairline and also between her ears. The measurement is not exact for many reasons, and because we are using a two-dimensional photo of a three-dimensional object there is certainly some loss of accuracy. Nevertheless, here are the results:

Face length / Face Width	1.4722
Mouth width / Interocular distance	1.4000
Mouth width / Nose width	1.5556
Lips to chin / Interocular	1.7000
Lips to chin / Nose width	1.8889
Ear length / Nose width	1.5556
Average	1.5954
% Deviation from Φ	- 1.4%

On average, Taylor's measurements are quite close to phi, the ideal measurement of facial attractiveness.

(Note that I did the measurements manually, and they are inexact. I understand there are Python libraries such as dlib that can detect facial ladmarks, and this might eliminate some of the imprecision.)

2. What lipstick shade is she wearing?

A common question on the Internet is what are her favorite lipsticks. I think more interesting mathematically is: given a particular photo of Taylor, can we get the computer to mathematically identify her lipstick?

First, a little background about image files on the computer. A computer image is a collection of tiny dots called pixels. Each dot is about 1/96 inch. Each pixel contains a color code, as a 6-digit base 16 (hex) number, or as a 6 digit base 10 red-green-blue ordered triple. For example 861A23 in base 16 equals (134, 26, 35) in rgb base 10. The first two hex digits, 86, is the red. 86 in base 16 equals 8*16¹ + 6*16⁰ = 134 in base ten. The second two hex digits, 1A, is the green, etc.

We can ask how similar two colors are by plotting them in 3D and calculating their distance.

D = √ [ (134 – 174)² + (26 – 31)² + (35 – 61)² ] = 48.0

I started with five different photos of Taylor, where it appears to me she is wearing a different shade in each. I tried to crop each photo as a rectangular area as close as I could get to just her lips, and I saved each cropped rectangle as a file. I used R packsge colouR to find the most frequent colors. This was unsatisfactory, possibly because there was too much background color noise. I found a website https://www.remove.bg/ that removes background, and I retried colouR with the background removed files.

I did the same process with online swatches of ten lipsticks that the Internet says are among Taylor's favorites.

To my surprise, there were more than 50 different colors, most of them reasonably close, on both the Taylor photos and on the lipstick swatches.

Why are there so many colors?

there are 16⁶ = 16,777,216 different computer colors
women tell me lipstick never looks same on a person as in a tube
my dermatologist offered that lips have lines and grooves that create uneven coloring
it is hard to isolate the image of just lips in a rectangle
lighting creates distortions
there are image resolution and quality issues

Nevertheless, I decided to do the following. For each of Taylor's ten favorite lipsticks, I found the ten most frequent colors. For each of the five Taylor lip photos, I found their ten most frequent colors. Then for each photo, I calculated the average color distance from the photo to each lipstick. The lipstick with the minimum distance from the photo was then the computer forecast matching a photo to a lipstick. Get it?

Here are the five Taylor photos with the computer's match:

In summary, the computer thinks photo 1 has the minimum distance, 320, to its match. Do you agree this is the best match?

Photo	Shade	Distance
1	Ruby Woo	320
2	Elson 4	625
3	Blood Lust	388
4	Flame	392
5	Flame	399

3. Which of her songs are emotional?

The R package ‘taylor’ contains lyrics for 11 albums and 240 songs. I will examine TS songs for their emotional content, as follows:

% emotional words per song
most frequent words per song
trend in emotions over time
examine theory that 5th song in every album is most emotional

What makes a song emotional? Some possibilities are: emotional words, volume, tempo. pitch, rhythm, and instrumentation. A few musicians I spoke to have a longer list. But I don't have the data to measure any of these, except emotional words.

Linguists have created lists of emotional words. Mohammad and Turney created the NRC Emotion Lexicon list of over 14,000 English words and their associations with ten basic emotions (anger, anticipation, disgust, fear, joy, negative, positive, sadness, surprise, and trust). A word may have more than one emotion. For example, faith has the emotions anticipation, joy, positive, and trust.

Like most datasets, Taylor Swift songs required a considerable amount of data cleaning. In a text analysis like this, data cleaning includes removing punctuation and capitals, deleting stop words (a, oh, the, etc.) that add no value, and fixing informal speech and colloquialisms (dyin’, gonna, etc.). Taylor uses a large number of colloquialisms, often by dropping an ending g in an "ing" word. Also part of the cleaning is that words are stemmed so that love, lovable, loving, and lovely all reduce to "lov"; and then the stemmed words are lemmatized and returned to their root word love. The R packages SnowballC and textstem perform stemming and lemmatizing respectively. However, this combination did not always produce a valid lemmatized word, and I manually adjusted about 500 words which I placed into a file lem_replacements_df.csv. The final result is over 23,000 words from her 240 songs, which I am calling processed words.

For all 240 songs combined, I calculated Taylor's most frequent words, her most frequent emotional words, her number of words for each of the 10 emotions, and the ratio r = number of emotional words / number of processed) words. The results are contained in the following three plots:

The above is a baseline for all songs combined. The r value is .364. I asked a friend whose teen-aged daughter is a Swiftie, to suggest a few Taylor songs that she considers especially emotional, and I created the same types of plots for some of these songs. I'll share the plots for the song "My Tears Ricochet", which had a greater than average r value:

Note that in the above song, the counts are pretty small. Perhaps this is one explanation for why a number of songs that we thought were emotional did not have a higher than average r value.

Another question I was curious about is whether Taylor's songs have become more emotional over time. I am using original release date of album as the time variable. For songs that Taylor re-recorded, I am using the original release date under the assumption that her re-release may have changed the music but not the lyrics.

That her songs are becoming more emotional over time appears not to be true. To the contrary, the slope of a poor fitting trend line is negative, but its F-statistic says the linear model is not significant.

Finally, there is a theory that every fifth song on an album is intentionally more emotional than the others.

This theory is discussed on the Internet and even Taylor gives it some credence. Albums by release year provide 10 data points (there are 11 albums, but two were released in the same year). In 6 years out of 10 the track 5 r ratio exceeded the all other tracks, and in 4 years out of 10 they did not. A two-sample t-test with one tail concludes that the track 5 mean is not significantly greater than the all other tracks mean, p = .4591. Note that track 5 has an outlier, “All you had to do was stay,” which sounds like it should have a high r, but its r was .134.

4. Bonus: &$!#%

To the Taylor Tots parents: Our Taylor has been known to use an R-rated word in her songs, or two. Here is the frequency of some of her most frequent such words (and there are others), mostly from her Tortured Poets Department album:

As I mentioned earlier, each of the three parts of this analysis - her attractiveness, her lipstick shades, and her lyrics - contains various imprecisions and judgments. Perhaps some day I will go back to this project and make some improvements. Comments are welcome. But I think it was a non-routine use of math. And it was fun!

This project uses three R script files and a number of image files. The R code is shown below, but it is long. Both the R code files and the image files may be found at https://github.com/fcas80/The-Taylor-Swift-Project/tree/main


# file 1 of 3: lovely.txt

library(png)
library(ggplot2)
library(grid)

image_path <- "my_path/ts1_phi_removebg.png"
img <- readPNG(image_path, native=TRUE)   # width 584, height 457
height <- nrow(img)   # 383
width <- ncol(img)    # 331
raster_img <- grid::rasterGrob(img, interpolate = TRUE)
df <- data.frame(xpos = c(0, width), ypos = c(0, height))

# 1. plot photo on minimum grid
ggplot(data = df,
       aes(xpos, ypos)) +
  xlim(0, width) + ylim(0, height) +
  geom_blank() +
  annotation_custom(raster_img, xmin=0, xmax=width, ymin=0, ymax=height) + 
  theme(axis.title.x = element_blank(), 
      axis.title.y = element_blank())

# 2. plot photo on more detailed grid; measurements are manual and imprecise
ggplot(data = df,
       aes(xpos, ypos)) +
  xlim(0, width) + ylim(0, height) +
  geom_blank() +
  annotation_custom(raster_img, xmin=0, xmax=width, ymin=0, ymax=height) +
  geom_hline(yintercept = seq(0, height, by = 10), color = "gray", linwidth = 0.5) +
  geom_vline(xintercept = seq(0, width, by = 10), color = "gray", linwidth = 0.5) +
  scale_x_continuous(breaks = seq(0, width, by = 20)) +
  scale_y_continuous(breaks = seq(0, height, by = 20)) +
  annotate("segment", x = 90, y = 265, xend = 90, yend = 0, color = "red", linwidth = 3) +
  annotate("segment", x = 70, y = 90, xend = 250, yend = 90, color = "red", linwidth = 3) +
  annotate("segment", x = 130, y = 170, xend = 180, yend = 170, color = "red", linwidth = 3) +
  annotate("segment", x = 135, y = 105, xend = 180, yend = 105, color = "red", linwidth = 3) +
  annotate("segment", x = 125, y = 70, xend = 195, yend = 70, color = "blue", linwidth = 3) +
  annotate("segment", x = 160, y = 85, xend = 160, yend = 0, color = "blue", linwidth = 3) +
  annotate("segment", x = 50, y = 160, xend = 50, yend = 90, color = "red", linwidth = 3) +
  theme(axis.title.x = element_blank(),
        axis.title.y = element_blank())

segments_df <- data.frame(
  segment_name = c("face_length", "face_width", "dist_bet_eyes", 
                   "nose_width", "mouth_width", "lips_to_chin", "ear_length"),
  x = c(90, 70, 130, 135, 125, 160, 50),
  y = c(265, 90, 170, 105, 70, 85, 160),
  xend = c(90, 250, 180, 180, 195, 160, 50),
  yend = c(0, 90, 170, 105, 70, 0, 90))
segments_df$dist <- sqrt((segments_df$x - segments_df$xend)^2 + (segments_df$y - segments_df$yend)^2)
segments_df

ratios_df <- data.frame(
  ratio_name = c("face length / width", "mouth width / interocular", "mouth width / nose width", 
       "lips to chin / interocular", "lips to chin / nose width", "ear length / nose width"),
  ratio = rep(0, times=6))

ratios_df$ratio[1] <- round(segments_df$dist[1] / segments_df$dist[2], 4)   # face length / width
ratios_df$ratio[2] <- round(segments_df$dist[5] / segments_df$dist[3], 4)   # mouth width / interocular
ratios_df$ratio[3] <- round(segments_df$dist[5] / segments_df$dist[4], 4)   # mouth width / nose width
ratios_df$ratio[4] <- round(segments_df$dist[6] / segments_df$dist[3], 4)   # lips to chin / interocular
ratios_df$ratio[5] <- round(segments_df$dist[6] / segments_df$dist[4], 4)   # lips to chin / nose width
ratios_df$ratio[6] <- round(segments_df$dist[7] / segments_df$dist[4], 4)   # ear length / nose width 
ratios_df
m <- round(mean(ratios_df$ratio),3)
m
error <- round((m - (1+sqrt(5))/2)/m,3)
error
# END


# file 2 of 3: lipsticks.txt


setwd("my_path")
library(colouR)
library(ggplot2)

# These shades were determined from various Internet articles
shade <- c("Ruby Woo", "Morocco", "Dragon Girl", "Elson 4", "Always Red", "Kyoto Red", 
           "Eadie Scarlet", "Blood Lust", "Flame", "Red to Go")

# These swatches of shades were copied from various Internet stores
df_lipsticks <- data.frame(shade,
        file = c("Mac_Ruby_Woo.png", "NARS_morocco.png", "NARS_Dragon_Girl.png", "PatMcGrath_Elson4.png", 
           "Sephora_always_red.png", "Tatcha_Kyoto_Red.png", "Gucci_Velvet_Eadie_Scarlet.png", 
           "PatMcGrath_Blood_Lust.png", "TomFord_flame.png", "Armani_red_to_go.png" ))

# These files were created by copying various photos of TS, cropping rectangle of lips, then
# using https://www.remove.bg/ to remove backgrounds  
file = c("ts1_lips-removebg-preview.png", "ts2_lips-removebg-preview.png",
       "ts3_lips-removebg-preview.png", "ts4_lips-removebg-preview.png","ts5_lips-removebg-preview.png")
name = c("TS photo 1", "TS photo 2", "TS photo 3", "TS photo 4" ,"TS photo 5")
df_photos <- data.frame(file, name)

############

# p10 for ten lipstick shades, but some only have 1 color
p10 <- function(image, name) {
  # Get primary color
  primary_color <- colouR::getTopCol(image, n = 1, avgCols = FALSE, exclude = TRUE)
  
  top10 <- tryCatch(
    {
      colouR::getTopCol(image, n = 10, avgCols = FALSE, exclude = TRUE)
    },
    error = function(e) {
      primary_color
    }
  )
  
  if (nrow(top10) < 10) {
    top10 <- rbind(top10, top10[rep(1, 10 - nrow(top10)),])
  }
  
  plot <- ggplot(top10, aes(x = hex, y = freq, fill = hex)) +
    geom_bar(stat = 'identity') +
    scale_fill_manual(values = top10$hex) +
    labs(title = paste(name, "Top 10 colors by frequency")) +
    xlab("HEX color code") + ylab("Frequency") +
    theme(
      legend.position = "none",
      plot.title = element_text(size = 15, face = "bold"),
      axis.title = element_text(size = 15, face = "bold"),
      axis.text.x = element_text(angle = 45, hjust = 1, size = 12, face = "bold")
    )
  print(plot)
  return(top10)
}   # close p function

shade_dataframe <- data.frame(matrix(ncol = 10, nrow = 0))

for (i in 1:10) {
  result <- p10(df_lipsticks$file[i], df_lipsticks$shade[i])
  
  # Repeat the first color if there are fewer than 10 colors
  if (nrow(result) < 10) {
    result <- rbind(result, result[rep(1, 10 - nrow(result)),])
  }
  
  shade_dataframe <- rbind(shade_dataframe, t(result$hex))
}

colnames(shade_dataframe) <- paste0("Color", 1:10)
print(shade_dataframe)

#############

p5 <- function(image, name) {
   top10 <- colouR::getTopCol(image, n = 10, avgCols = FALSE, exclude = TRUE)
   return(top10)
}

df_ts_colors <- data.frame()
for(i in 1:nrow(df_photos)){
   top10_colors <- p5(df_photos$file[i], df_photos$name[i])
   top10_colors <- top10_colors$hex
   df_ts_colors <- if(i == 1) {
     df_ts_colors <- top10_colors
    } else {rbind(df_ts_colors, top10_colors)
   }
}
colnames(df_ts_colors) <- c("color1","color2","color3","color4","color5",
                         "color6","color7","color8","color9","color10")
rownames(df_ts_colors) <- df_photos$name
df_ts_colors <- t(df_ts_colors)
print(df_ts_colors)

# begin total RGB distance of a TS photo to centroid of lipstick shade

# sample 3d plot
library(scatterplot3d)
m <- data.frame(
     x = c(174,134),
     y = c(31,26),
     z = c(61,35))

s3d <- scatterplot3d(m$x, m$y, m$z, pch = 19, color = "blue", 
    xlim = c(100, 200), ylim = c(0, 50), zlim = c(0, 100), 
    xlab = expression(bold("R-axis")), ylab = expression(bold("G-axis")), zlab = expression(bold("B-axis")), 
    main = "Plot of two colors in RGB space", cex.main = 2)

s3d$points3d(m$x, m$y, m$z, type = "l", lty = 1, lwd = 2, col = "red")

# Use 'text' for adding labels with bold font
text(s3d$xyz.convert(m$x, m$y, m$z), labels = paste0("(", m$x, ",", m$y, ",", m$z, ")"), 
     pos = 3, cex = 2, col = "#00008B", font = 2)

A <- shade_dataframe$Color1
B <- shade_dataframe$Color2
C <- shade_dataframe$Color3
D <- shade_dataframe$Color4
E <- shade_dataframe$Color5
F <- shade_dataframe$Color6
G <- shade_dataframe$Color7
H <- shade_dataframe$Color8
I <- shade_dataframe$Color9
J <- shade_dataframe$Color10


hex_to_rgb <- function(hex) {
  rgb <- t(col2rgb(hex))
  return(data.frame(r=rgb[1,], g=rgb[2,], b=rgb[3,]))
}

# Function to calculate Euclidean distance between two RGB vectors
euclidean_distance <- function(rgb1, rgb2) {
  sqrt(sum((rgb1 - rgb2)^2))
}

# Function to calculate centroid of a vector of hex colors
calculate_centroid <- function(hex_colors) {
  rgb_values <- hex_to_rgb(hex_colors)
  centroid <- colMeans(rgb_values)
  return(centroid)
}

# Calculate centroids of shades, e.g., centroid_A <- calculate_centroid(A)
  centroids <- lapply(list(A, B, C, D, E, F, G, H, I, J), calculate_centroid)

for (j in 1:nrow(df_photos)) {
   x <- df_ts_colors[,j]
   centroid_x <- calculate_centroid(x)
# Calculate total distance of x to each centroid
   total_distance_to_centroid <- function(x, centroid) {
     x_rgb <- hex_to_rgb(x)
     total <- 0
     for (i in 1:nrow(x_rgb)) {
       total <- total + euclidean_distance(x_rgb[i, ], centroid)
     }
     return(total)
   }

  distances <- sapply(centroids, total_distance_to_centroid, x = x)
  w <- which(distances == min(distances))
  most_similar <- shade[w]
  cat("The lipstick most similar to", df_photos$name[j], "is:", most_similar, 
      "with distance of ", min(distances), "\n")
 }   #  end for loop on photos
  
# END


file 3 of 3:  lyrics.txt

library(taylor)
library(dplyr)
library(tidyverse)
library(tidytext)
data(stop_words)
library(textclean)
library(SnowballC) # For stemming
library(textstem) # For lemmatization

######################################
# functions, preliminaries

# function to print more rows
more_rows <- function(filename){
  options(tibble.print_max = nrow(filename), tibble.print_min = nrow(filename))
  print(filename)
}

# function to fix informal speech (colloquialisms); also contractions, punctuation, lower case
colloq <- function(df) {
  pattern <- "ahh|Ah-uh|ahah|eh|haah|haha|iiii|Oh|oh|Oh-oh|oh-oh|ooh-oh|oohah|Uh|Uh-oh|uh-oh-oh-oh|ya|La-la-la|lala|la-la-la|la-la-la-la|Mm-mm|Mmm-mm|mm-mm|mm-mm-mm-mm|Ha-ha-ha"
  df <- df %>%
    mutate(lyric = str_remove_all(lyric, paste0("\\b(", pattern, ")\\b"))) %>%
    mutate(lyric = str_replace_all(lyric, 
               c("ain't" = "is not", 
                 "Beggin'" = "begging", "beggin'" = "begging", "birth right" = "birthright", "blood-soaked" = "blood soaked",
                 "'bout" = "about", "burnin'" = "burning", "callin'" = "calling", "'Cause" = "because", "'cause" = "because", "Cept" = "Except", "cursee" = "cure",
                 "convers" = "conversation","crashin'" = "crashing", "doin'" = "doing", "driftin'" = "drifting", "dyin'" = "dying", "'em" = "them", "feelin'" = "feeling", "flyin'" = "flying","feverishlying" = "feverishly", "'fore" = "before", "'Fore" = "before", "foreyesgn" = "foreign",
                 "everythi-i-ing" = "everything","fuckin'"="fuck", "gettin'" = "getting", "gonna" = "going to", "gotta" = "got to", "happ'nin'" = "happening", "haven't" = "have not", "Holdin'" = "holding", "hero's" = "hero",
                 "Hopin'" = "hoping", "hopin'" = "hoping","I'ma" = "I am going to", "kinda"="kind of", "king's" = "king", "keepin" = "keeping", "Laughin'" = "laughing", "lookin'" = "looking", "losin" = "losing", "losingg" = "losing", "lovin'" = "loving", "lucki" = "lucky", "no-one" = "no one", "Nothin'" = "nothing", "nothin'" = "nothing","one-hand" = "one hand",
                 "mornin'" = "morning", "'nother"="another", "nothin'" = "nothing", "pickin'" = "picking", "post-mortem" = "postmortem", "prayin'" = "praying", "Prayin'" = "praying", "pretti" = "pretty", "ridin'" = "riding", "Sneakin'" = "sneaking",
                 "outta" = "out of", "'round" = "around", "self-implode" = "self implode", "shoulda" = "should have", "standin'" = "standing", "summer's" = "summer is", "There's" = "there is", "Thinkin'" = "thinking", 
                 "thankin'" = "thanking", "thinkin'" = "thinking", "'Til" = "until", "'til" = "until", "tryin'" = "trying", "tryna" = "trying to", "Tryna" = "trying to","twin-sized" = "twin sized", 
                 "waitin'" = "waiting", "white-knuckle" = "white knuckle",
                 "wanna" = "want to", "weepin'" = "weeping", "whatcha" = "what are you", "Where's" = "where is", "Why'd" = "why did",
                 "wide-eyed" = "wide eyed", "wonderin'" = "wondering", "Wonderin'" = "wondering"))) %>%
    mutate(lyric = map_chr(lyric, clean_text))
  
  return(df)
}


custom_stop_words <- bind_rows(stop_words, 
           tibble(word = c("ah", "dadadada", "ha", "haha", "hey", "hi", "huh", "la", "MMM", "mmm", "mmmm", "mm", "na", "oh", "okay","OK", "ok", "ooh", 
           "uh", "whoa", "woah", "yeah"), lexicon = "custom"))


# Function to clean text  (included in colloq function)
clean_text <- function(text) { 
  text %>% 
    replace_contraction() %>% 
    str_remove_all("[[:punct:]]") %>% 
    str_to_lower()
}

replacements_df <- read.csv("my_path/lem_replacements_df.csv", header = TRUE)

# Function to stem and lemmatize
pro <- function(words_df) {
  words_df <- words_df %>% filter(!is.na(word))
  processed_words <- words_df %>%
    mutate(stemmed_word = wordStem(word),
           lemmatized_word = lemmatize_words(stemmed_word)) %>%
    filter(lemmatized_word != "") %>%  # Filter out empty strings
    left_join(replacements_df, by = c("lemmatized_word" = "stemmed_word")) %>%
    mutate(lemmatized_word = ifelse(is.na(lemmatized_word.y), lemmatized_word, lemmatized_word.y)) %>%
    select(-lemmatized_word.y)  # Remove the extra column created during the join
}

nrc_sentiments <- get_sentiments("nrc") %>%
  filter(word != "count" & word != "heel" & word != "truck" & word != "wear" & word != "word")

common_theme <- theme(
  legend.position="NULL",
  plot.title = element_text(size = 15, face = "bold"),
  plot.subtitle = element_text(size = 12.5, face = "bold"),
  axis.title = element_text(size = 15, face = "bold"),
  axis.text = element_text(size = 15, face = "bold"),
  legend.title = element_text(size = 15, face = "bold"),
  legend.text = element_text(size = 15, face = "bold"))

sentiment_colors <- c("anger" = "red", "anticipation" = "green", "disgust" = "brown",
                      "fear" = "purple", "joy" = "yellow", "negative" = "gray", "positive" = "lightblue",
                      "sadness" = "blue", "surprise" = "pink", "trust" = "turquoise")

emotion_plot <- function(sentiment_summary, song, r){
  ggplot(sentiment_summary, aes(x = sentiment, y = word_count, fill = sentiment)) + 
    geom_col(color = "black") + 
    scale_fill_manual(values = sentiment_colors) +
    labs(title = paste(song, "# words by emotion"),
         subtitle = paste("emotional words / all (processed) words =", r), 
         x = "Emotion", 
         y = "Number of Words") + 
    common_theme + 
    theme(axis.text.x = element_text(angle = 45, hjust=1))
}

lemmatized_colors <- c("red", "blue", "green", "orange", 
                       "purple", "cyan", "magenta", "yellow", "brown", "pink")

word_plot <- function(word_summary, song, title){
  ggplot(word_summary, aes(x = lemmatized_word, y = total_count, fill = lemmatized_word)) + 
    geom_col(color = "black") + 
    scale_fill_manual(values = lemmatized_colors) +
    labs(title = paste(song, title),
         x = "", 
         y = "Frequency") + 
    common_theme +
    theme(axis.text.x = element_text(angle = 45, hjust=1))
}

all_albums <- c("Taylor Swift", "Fearless (Taylor's Version)", "Speak Now (Taylor's Version)", "Red (Taylor's Version)", "1989 (Taylor's Version)",
                "reputation", "Lover", "folklore", "evermore",  "Midnights", "THE TORTURED POETS DEPARTMENT")
number_of_tracks <- c(15, 26, 22, 30, 22, 15, 18, 17, 17, 19, 31)
original_release_year <- c(2006, 2008, 2010, 2012, 2014, 2017, 2019, 2020, 2020, 2022, 2024)
# The years are original release years, i.e., not "Taylor's version" years

#######################################

# Find song by word in title - could be lower case!
x <- "ricochet"
df1 <- taylor_album_songs %>%
   select(album_name, track_name) %>%
   filter(grepl(x, track_name))
n <- df1$album_name[1]
a <- which(all_albums == n)
a

# Find track number
x <- df1$track_name[1]
t <- taylor_album_songs %>%
  filter(album_name == n) %>%
  select(album_name, track_number, track_name) %>%
  filter(track_name == x) %>%
  select(track_number) %>%
  pull() %>% as.numeric()
t

Or instead, choose album a & track t   eg: Style: a=11, t=3
# a <- 8
# t <- 5

#######################################
song <- taylor_album_songs %>% 
  # filter(album_name == all_albums[a] ) %>%  # all songs in an album
  filter(album_name == all_albums[a] & track_number == t) %>% 
  pull(track_name[1])   # NOTE the track_name is always 1
if (length(song) == 0) print("No such track number/song") else print(song)
song <- gsub("\\(Taylor's Version\\)|\\[Taylor's Version\\]", '', song)
album <- all_albums[a]

df <- taylor_album_songs %>%
  filter(album_name == all_albums[a] & track_number == t) %>%
  select(lyrics) %>%
  unnest(lyrics) %>%
  select(line, lyric)
more_rows(df)   # Examine for colloquialisms; add to function colloq

df <- colloq(df)
more_rows(df)

words_df <- df %>%
  unnest_tokens(word, lyric) %>%
  anti_join(custom_stop_words, by = "word")
more_rows(words_df)

processed_words <- pro(words_df) 
more_rows(processed_words) # Examine for words incorrectly lemmatized; add to function pro

processed_words_count <- processed_words %>%
  select(lemmatized_word) %>%
  group_by(lemmatized_word) %>%
  summarize(total_count = n()) %>%
  arrange(desc(total_count)) %>%
  head(10)
more_rows(processed_words_count) 

total_processed_words <- nrow(processed_words) 

sentiment_analysis <- processed_words %>%
  inner_join(nrc_sentiments, by = c("lemmatized_word" = "word"),
             relationship = "many-to-many") %>%
  count(lemmatized_word, sentiment, sort = TRUE)

distinct_sentiment_words <- sentiment_analysis %>% 
  pull(lemmatized_word) %>%
  n_distinct()   

total_sentiment_words <- sentiment_analysis %>%
  select(lemmatized_word, n) %>%
  distinct() %>%
  group_by(lemmatized_word) %>%
  summarize(total_count = sum(n)) %>%
  summarize(total_count = sum(total_count)) %>%
  pull(total_count)   

r <- round(total_sentiment_words / total_processed_words, 3)
cat("emotional words = ", total_sentiment_words, ", total processed words = ", total_processed_words, ", ratio = ", r )

sentiment_summary <- sentiment_analysis %>%
  group_by(sentiment) %>%
  summarise(word_count = sum(n))
sentiment_summary

word_summary <- sentiment_analysis %>%
  select(lemmatized_word, n) %>%
  distinct() %>%
  group_by(lemmatized_word) %>%
  summarize(total_count = sum(n)) %>%
  arrange(desc(total_count)) %>%
  head(10)
word_summary   # check for word with unusual sentiment (truck = trust?); delete from NRC 

word_plot(processed_words_count, song, 
          title="10 most frequent processed words")
word_plot(word_summary, song, 
          title = "10 most frequent emotional words")
emotion_plot(sentiment_summary, song, r)

# End individual song analysis;  for all songs combined, in df <- taylor_album_songs %>%
# delete filter(album_name == all_albums[a] & track_number == t) %>%, and also song <- ""

dirty_words <- processed_words %>%
  select(lemmatized_word) %>%
  filter(, lemmatized_word=="fuck" | lemmatized_word=="shit" | lemmatized_word=="slut" | lemmatized_word=="bitch") %>%
  group_by(lemmatized_word) %>%
  mutate(lemmatized_word = str_replace(lemmatized_word, "fuck", "f**k")) %>%
  mutate(lemmatized_word = str_replace(lemmatized_word, "shit", "sh*t")) %>%
  mutate(lemmatized_word = str_replace(lemmatized_word, "slut", "sl*t")) %>%
  mutate(lemmatized_word = str_replace(lemmatized_word, "bitch", "b*tch")) %>%
  summarize(total_count = n())
dirty_words   # limited to these four; there are others :)

dirty_colors <- c("#FF0000","#000000","#800080","#FFA500")
ggplot(dirty_words, aes(x = lemmatized_word, y = total_count, fill = lemmatized_word)) + 
  geom_col(color = "black") + 
  scale_fill_manual(values = dirty_colors) +
  labs(title = "TS # of R-rated Words",
       x = "", 
       y = "Number of Words") + 
  common_theme 

# END

Tuesday, October 29, 2024

Sheldon Cooper's favorite number

If you are a fan of the television series "The Big Bang Theory", then you know Sheldon often wears a shirt with 73 on it.

In the tenth episode of the fourth season of The Bing Bang Theory, aired December 9, 2010, https://www.imdb.com/title/tt1632225/characters/nm1433588 , Sheldon asks what everyone’s favorite number is.

Raj suggests 5318008. Flip this number upside-down to see what Raj has in mind.

Sheldon declares the best number is 73 because it satisfies the following two properties: a. the product property, and b. the mirror property:

p=73 is the i = 21st prime number. The product of the digits of p, 7 and 3, equals i.
The reverse of the digits of 73 (its mirror) is 37. 37 is the 12th prime number. 12 is the reverse of 21, and 21 is the product of the digits of 73.

We will call a prime number satisfying both the product property and mirror property a “Sheldon Prime”.

Sheldon adds that 73 in base two, 1001001, is a palindrome number. However, many base two numbers are palindrome numbers; 5 in base two is 101.

This episode was the 73rd episode of Bing Bang. Actor Jim Parsons was born in 1973, and he was 37 years old when the episode aired. Coincidences?

The 73rd day of a non-leap year is March 14, also known as Pi Day.

73 is an emirp. An emirp (prime spelled backwards) is a prime number that results in a different prime when its digits are reversed.

Mathematicians Pomerance and Spicer prove that 73 is the only prime satisfying both the product property and mirror property. https://math.dartmouth.edu/~carlp/sheldon02132019.pdf . They also state that the only primes less than 10^10 satisfying the product property are p=17, i=7; p=73, i=21; and p=2475989, i=181440.

The following R code: a. verifies that 73 is the only Sheldon Prime within the first 10000 primes. b. verifies that 73 in base two, 1001001, is a palindrome number. c. flips Raj’s 5318008 number upside-down.

# a. 73 is the only Sheldon prime in first 10000 primes:

library(RcppAlgos)
# Function to calculate the product of digits
digit_product <- function(x) {
digits <- as.numeric(strsplit(as.character(x), NULL)[[1]])
digits <- digits[!is.na(digits)]
if (length(digits) == 0) return(NA)
prod(digits)
}

# Function to reverse digits of a number
reverse_digits <- function(x) {
as.numeric(paste(rev(strsplit(as.character(x), NULL)[[1]]), collapse = ""))
}

# Function to check if a number is prime
is_prime <- function(num) {
if (num <= 1) return(FALSE)
if (num == 2) return(TRUE)
if (num %% 2 == 0) return(FALSE)
for (i in 3:sqrt(num)) {
if (num %% i == 0) return(FALSE)
}
return(TRUE)
}

# Generate the first 10000 primes # stopped here for memory & speed issues
primes <- primeSieve(104730) # Generates first 10000 primes; the 10,000th prime is 104729
results <- list()

for (i in seq_along(primes)) {
if (digit_product(primes[i]) == i) { # product property
   reversed_prime <- reverse_digits(primes[i])
   if (is_prime(reversed_prime)) { # mirror property
    reversed_index <- which(primes == reversed_prime)
    if (reverse_digits(reversed_index) == digit_product(primes[i])){
     results <- append(results, list(c(prime = primes[i], index = i, reversed_prime = reversed_prime,     reversed_index = reversed_index)))
    }
   }
}
}

if (length(results) > 0) {
print(results)
} else {
print("No such prime found.")
}

# b. verify that 73 in base two, 1001001, is a palindrome number.

library(gtools)
x <- 73
v <- baseOf(x, base = 2)
w <- rev(v)
if (all(v == w)) {
cat(v, "is a palindrome number")
} else {
cat(v, " is not a palindrome number")
}

# c. flips Raj’s 5318008 number upside-down.

flip_string <- function(x) {
flip <- c('0' = '0', '1' = 'I', '2' = '2', '3' = 'E', '4' = '4', '5' = 'S', '6' = '9', '7' = '7', '8' = 'B', '9' = '6')
paste(rev(sapply(unlist(strsplit(as.character(x), NULL)), function(ch) flip[ch])), collapse = "")
}

Raj <- '5318008'
flipped_Raj <- flip_string(Raj)
flipped_Raj

# End

Friday, September 27, 2024

Rene Descartes walks into a bar

Rene Descartes walks into a bar, by Jerry Tuttle

I recently told the old Rene Descartes joke to a math class: Rene Descartes walks into a bar. The bartender asks, "Would you like a beer?" Descartes pauses for a moment and then replies, "I think not." Then poof - he disappears.

Of course I naively assumed my students had been exposed to the Descartes quote, "I think, therefore I am." Philosopher and mathematician Rene Descartes wrote this in French in 1637, and later in Latin as cogito, ergo sum."

After explaining the Descartes quote, I think the students understood the joke. Well, maybe it's not that funny.

But perhaps funnier to math people than you realize, is: this joke is logically flawed because the punchline is the inverse to the original conditional statement, and an inverse is not logically equivalent to the original statement.

Let P and Q be declarative sentences that can be definitively classified as either true or false. Then define:

Conditional Statement: If P, then Q.
Converse: If Q, then P.
Inverse: If not P, then not Q.
Contrapositive: If not Q, then not P

Two conditional statements are defined as logically equivalent when their truth values are identical for every possible combination of truth values for their individual declarative sentences.

P	Q	statement	converse	inverse	contrapositive
TRUE	TRUE	TRUE	TRUE	TRUE	TRUE
TRUE	FALSE	FALSE	TRUE	TRUE	FALSE
FALSE	TRUE	TRUE	FALSE	FALSE	TRUE
FALSE	FALSE	TRUE	TRUE	TRUE	TRUE

The above table shows statement and contrapositive have the same truth values in columns 3 and 6, and so are logically equivalent. Statement and inverse are not logically equivalent.

The Descartes quote is, "If I think, therefore I am", or "If P then Q". The punchline is, "If I don't think, therefore I am not", or "If not P, then not Q". The punchline is the inverse, and is not logically equivalent to the quote. If P is false, then "if P then Q" is true regardless of the value of Q. So Q can be either true or false.

Occasionally on television someone, often a police detective, will make a statement where they confuse a statement with its converse or inverse, and I have been known to yell at the television.

Descartes is known for developing analytic geometry, which uses algebra to describe geometry. Descartes' rule of signs counts the roots of a polynomial by examining sign changes in its coefficients.

And before someone else feels the need to say this, I will: "Don't put Descartes before the horse." This is perhaps the punchline to changing the original joke to "A horse walks into a bar ... "

The following is R code to create truth tables. Logical is a variable type in R. Conditional statements in R are created using the fact that “If P then Q” is equivalent to “Not P or Q”. I am defining the logic rules for statement, converse, inverse, contrapositive, but I could have defined the rules for more complicated statements as well.

# Define the possible values for P and Q
P <- c(TRUE, TRUE, FALSE, FALSE)
Q <- c(TRUE, FALSE, TRUE, FALSE)

# Calculate the 4 logical rules: statement, converse, inverse, contrapositive
# (Note that “if P then Q” is equivalent to “Not P or Q”.)
P_implies_Q <- !P | Q
Q_implies_P <- !Q | P
not_P_implies_not_Q <- P | !Q
not_Q_implies_not_P <- Q | !P

# expand.grid(P, Q) would also be a good start, but I wanted a specific ordering
# Create a data frame to display the truth table
truth_table <- data.frame(
P = P,
Q = Q,
`P -> Q` = P_implies_Q,
`Q -> P` = Q_implies_P,
`!P -> !Q` = not_P_implies_not_Q,
`!Q -> !P` = not_Q_implies_not_P
)

# Print the truth table
colnames(truth_table) <- c("P", "Q", "statement", "converse", "inverse", "contrapositive")
print(truth_table)

P_variable <- "I think"
Q_variable <- "I am"

colnames(truth_table) <- c(P_variable, Q_variable, "statement", "converse", "inverse", "contrapositive")
print(truth_table)

End

Sunday, September 15, 2024

The dead body problem, and the coffee with milk problem

Here are two problems in life we all commonly face: What time did the victim die, and is it better to put milk in your coffee right away or to wait?

In any police show with a murder, one of the detectives will always ask this math question: "What time did the victim die?" And yes, it is a math question, whose solution dates back over 300 years to Isaac Newton. For algebra and calculus students, it is probably more interesting than a lot of word problems they see.

Newton’s Law of Cooling states that the rate of change of the temperature of an object is proportional to the difference between its own temperature and the surrounding temperature.

         Let the temperature of the object be T(t) at time t.
         Let T0 denote the initial temperature of the object at time 0.
         Let Ts denote the temperature of the surrounding medium.

Newton's Law of Cooling states dT/dt = -k (T - Ts), where k is a cooling constant and where the negative sign is for the case where the body is getting cooler. The solution of the differential equation is:

T(t) = Ts + (T0 - Ts) * e^(-k*t)

Newton published his Law of Cooling in 1701. Modern physicists believe Newton's Law holds up well and approximates the result in some but not all situations. Large temperature differences between the object and its surroundings, or non-constant temperatures, will cause difficulties. I leave the details to the physicists.

Problem 1: The dead body problem

Example: A dead body was 80 degrees Fahrenheit when it was discovered. Two hours later it had cooled to 75 degrees. The room is 60 degrees. What time did the body die?

Assume T0 at time of death = 98.6
Assume Ts the temperature of surrounding room = 60
Let k be the unknown cooling constant. k depends on the characteristics of the object and its environment and is often between 0.1 and 0.2.

T(t) = Ts + (T0 − Ts) * e^(−kt)

First equation, at time of discovery:
Let t = t_0 be time of discovery. 80 = 60 + (98.6 - 60) * e^(-k*t_0)

Second equation, two hours later:
At t = t_0 + 2, 75 = 60 + (98.6 - 60) * e^(-k*(t_0 + 2))

Solve first equation for k:
80 = 60 + (98.6 - 60) * e^(-k*t_0)
20 = 38.6 * e^(-k*t_0)
e^(-k*t_0) = 20 / 38.6 = .518

Solve second equation for k:
75 = 60 + (98.6 - 60) * e^(-k*(t_0 + 2))
15 = 38.6 * e^(-k*(t_0 + 2))
e^(-k*(t_0 + 2)) = 15 / 38.6 = .389

Solve for k by taking the ratio of the two equations, noting t_0 drops out:
[e^(-k*(t_0 + 2))] / [e^(-k*t_0) ] = .389 /518
e^(-2k) = .751
-2k = LN(.751)
k = - LN(.751) / 2
k = .143

Solve for time since death from first equation
80 = 60 + (98.6 - 60) * e^(-.143*t_0)
20 = 38.6 * e^(-.143*t_0)
e^(-.143*t_0) = 20 / 38.6 = .518
-.143*t_0 = LN(.518)
t_0 = - LN(.518) / .143
t_o = 4.8 hours

The body had been cooling for 4.8 hours before it was discovered.

Problem 2: The coffee with milk problem

Here is a different application of Newton's Law of Cooling.

Example: Who drinks the hotter coffee after five minutes: The person who puts milk in right away and then waits five minutes, or the person who waits five minutes and then puts in milk?

Assume initial temperature of coffee, T0 = 200 F
Assume temperature of the room, Ts = 72 F
Assume initial temperature of the milk, Tm = 40 F
Assume the mixture is 80% coffee and 20% milk
Assume cooling constant k = 0.1.

Case 1: Add cold milk immediately

mixture T(0) = .80*200 + .20*40 = 168
mixture T(5) = 72 + (168 − 72) * e^(−0.1*5) = 130.23

Case 2: Wait 5 minutes, then add room temperature milk

coffee T(5) = 72 + (200 - 72)*e^(-0.1*5) = 149.64
milk T(5) = 72 + (40 - 72)*e^(-0.1*5) = 52.59
mixture T(5) = .80*149.64 + .20*52.59 = 130.23

The conclusion is that cases 1 and 2 give the same temperature! However, there is also a case 3.

Case 3: Wait 5 minutes, then add cold milk

coffee T(5) = 72 + (200 - 72)*e^(-0.1*5) = 149.64
mixture T(5) = .80*149.64 + .20*40 = 127.71

For simplicity, I ignored a few things that probably have a minor effect on the final temperature: I assume the mixing process is instantaneous. I assume the cup does not absorb heat from the coffee. I assume stirring the mixture does accelerate heat transfer. Perhaps there are others. I can not measure the effect of these.

Thank you, Isaac Newton (who as a Brit, perhaps drank tea?).

End

How do I count thee? Let me count the ways?