Water injection into California oil and gas wells

Data and R code to reproduce the analysis underlying this Sep. 18, 2022 Inside Climate News article, examining steam/water reported as being injected into oil and gas wells for fossil fuel extraction in Kern County and the rest of California.

Data and setting up

The data covers the period 2018-2021 and is derived from public database downloads provided by the Geologic Energy Management Division of the California Department of Conservation (CalGEM). Since 2018, this data has been released as SQL Server backup files, each covering a single year. We downloaded the data on June 20, 2022 and then converted individual tables from these files to CSV files and then processed the data into RData format using the script data_processing.R.

CalGEM provides data on steam/water injection in quarterly reports and monthly reports for each well, which we joined to data for each well to identify the locations by county and operators. The quarterly reports contain more information on the sources and quality of water and were used for most of the analysis. However, to address concerns about data quality, we also ran some analyses from the monthly reports. In the CalGEM data, water quantities are given in barrels. The calculations below convert to gallons by multiplying these values by 42.

# load required packages
library(tidyverse)
library(lubridate)
library(DT)

# load processed data
load("calgem.RData")

# data processing to allow analysis comparing Kern County to the rest of the state, whether water was suitable in its untreated state for domestic use or irrigation, and whether any treatment was applied.
q_inject <- q_inject %>%
  mutate(county2 = case_when(county == "Kern" ~ "Kern County",
                             TRUE ~ "Other"),
         water_treated = paste(water_treated_deoiling,water_treated_disinfection,water_treated_desalinization,water_treated_membrane,water_treated_other),
         water_suitable_treated = case_when(water_suitable == "Y" ~ "yes",
                                            water_suitable == "N" & grepl("Y",water_treated) ~ "no_treated",
                                            TRUE ~ "no_untreated"))

m_inject <- m_inject %>%
  mutate(county2 = case_when(county == "Kern" ~ "Kern County",
                             TRUE ~ "Other")) 

Use of high-quality water for oil and gas extraction

Most of the water/steam injected into oil and gas wells is wastewater, primarily “produced” water, which comes from wells along with oil and gas when the fuels are extracted. However, some high-quality water, marked as suitable for domestic or irrigation use without treatment, is also injected, which is a concern in drought-prone California.

This water primarily comes from:

  • bodies of surface water including streams, lakes, and the State Water Project, a massive system of dams and aqueducts that ferries rain and snowmelt from the Sierra Nevada mountains to parched Southern California.
  • domestic water supplies.

For this analysis, we removed any water used in offshore operations and injection into wells marked with the well_type_code "WD," which designates disposal of water, rather than its use in fossil fuel extraction.

q_inject %>%
  filter(!grepl("Offshore",county) & well_type_code != "WD" & water_suitable == "Y") %>%
  group_by(county2) %>%
  summarize(gallons_injected = sum(steam_water_injected_bbl, na.rm = TRUE) * 42) %>%
  mutate(percent = round(100 * gallons_injected/sum(gallons_injected),2),
         gallons_injected = prettyNum(gallons_injected, big.mark = ",")) %>%
  datatable()

Kern County accounts for about three-quarters of California's oil and gas production. Even so, quarterly injection reports indicate that operators in the county used a disproprotionate quantity of water marked as suitable for domestic or irrigation use without treatment, based on measurements on the quantity of dissolved solids. Kern County accounted for more than 99.5 percent of the use of this high-quality water across California.

q_inject %>%
  filter(!grepl("Offshore",county) & well_type_code != "WD") %>%
  group_by(water_suitable_treated, county2) %>%
  summarize(gallons_injected = sum(steam_water_injected_bbl, na.rm = TRUE) * 42) %>%
  pivot_wider(names_from = water_suitable_treated, values_from = gallons_injected) %>%
  mutate_all(~replace(., is.na(.), 0)) %>%
  rowwise() %>%
  mutate(
    total = sum(c(no_treated,no_untreated,yes)),
    no_treated_pc = round(100 * no_treated/total,2),
    no_untreated_pc = round(100 * no_untreated/total,2),
    yes_pc = round(100 * yes/total,2),
    no_treated = prettyNum(no_treated, big.mark = ","),
    no_untreated = prettyNum(no_untreated, big.mark = ","),
    yes = prettyNum(yes, big.mark = ","),
    total = prettyNum(total, big.mark = ",")
  ) %>% 
  select(-total) %>%
  datatable()

In Kern County, almost 1.3 percent of the water injected into oil and gas wells for oil and gas extraction was marked as being suitable for domestic or irrigation use without treatment, compared to around one hundredth of one percent for the rest of the state.

Use of surface water, including the State Water Project

q_inject %>%
  filter(!grepl("Offshore",county) & well_type_code != "WD" & water_source_text == "Surface Water") %>%
  group_by(water_suitable_treated) %>%
  summarize(gallons_injected = sum(steam_water_injected_bbl, na.rm = TRUE) * 42) %>%
  mutate(percent = round(100 * gallons_injected/sum(gallons_injected),2),
         gallons_injected = prettyNum(gallons_injected, big.mark = ",")) %>%
  datatable()

All of the surface water injected into oil and gas wells for fossil fuel extraction was marked as being suitable for domestic use or irrigation without treatment. We then looked at the use of this water by year.

q_inject %>%
  filter(!grepl("Offshore",county) & well_type_code != "WD" & water_source_text == "Surface Water" & water_suitable == "Y") %>%
  group_by(county2, year) %>%
  summarize(gallons_injected = sum(steam_water_injected_bbl, na.rm = TRUE) * 42) %>%
  mutate(gallons_injected = prettyNum(gallons_injected, big.mark = ",")) %>%
  pivot_wider(names_from = county2, values_from = gallons_injected) %>%
  mutate_all(~replace(., is.na(.), 0)) %>%
  datatable()

Kern County accounted for almost all the surface water injected into oil and gas wells in California for fossil fuel extraction. The use of this water declined from 2018 to 2021.

We then looked at the sources of this water used in Kern County, which required some further data cleaning.

kern_surface <- q_inject %>%
  filter(!grepl("Offshore",county) & well_type_code != "WD" & water_source_text == "Surface Water" & water_suitable_treated == "yes" & county == "Kern") 
  
unique(kern_surface$water_source_name)
## [1] "California Aqueduct" "West Kern WSD"       "CALIFORNIA AQUEDUCT"
## [4] "4"

The California Aqueduct is part of the State Water Project, so these entries could be consolidated.

kern_surface <- kern_surface %>%
  mutate(water_source_name2 = case_when(grepl("California Aqueduct", water_source_name, ignore.case = TRUE) ~ "State Water Project",
                                       TRUE ~ water_source_name))

kern_surface %>%
  group_by(water_source_name2) %>%
  summarize(gallons_injected = sum(steam_water_injected_bbl, na.rm = TRUE) * 42) %>%
  mutate(percent = round(100 * gallons_injected/sum(gallons_injected),2),
         gallons_injected = prettyNum(gallons_injected, big.mark = ",")) %>%
  arrange(-percent) %>%
  datatable()

"4" appears to be a data entry error, with the operator entering the numerical code for surface water rather than a named water source.

The vast majority of the surface water injected into oil and gas wells for fossil fuel extraction in Kern County came from the State Water Project. Which operators used this water?

kern_surface %>%
  filter(water_source_name2 == "State Water Project") %>%
  group_by(operator_name) %>%
  summarize(gallons_injected = sum(steam_water_injected_bbl, na.rm = TRUE) * 42) %>%
  arrange(-gallons_injected) %>%
  mutate(percent = round(100 * gallons_injected/sum(gallons_injected),2),
         gallons_injected = prettyNum(gallons_injected, big.mark = ","))  %>%
  arrange(-percent) %>%
  datatable()

Berry Petroleum accounted for about two-thirds of the high-quality water from the State Water Project injected into oil and gas wells for fossil fuel extraction in Kern County.

We then looked at the use of water from the State Water Project in Kern County by year.

kern_surface %>%
  filter(water_source_name2 == "State Water Project") %>%
  group_by(year) %>%
  summarize(gallons_injected = sum(steam_water_injected_bbl, na.rm = TRUE) * 42) %>%
  mutate(gallons_injected = prettyNum(gallons_injected, big.mark = ",")) %>%
  datatable

The diversion of water marked as coming from the State Water Project for injection into oil and gas wells in Kern County has been falling in recent years, according to CalGEM’s data, from more than 234 million gallons in 2018 to around 58 million gallons in 2021.

From domestic water suppliers

q_inject %>%
  filter(!grepl("Offshore",county) & well_type_code != "WD" & water_source_text == "Domestic Water System") %>%
  group_by(water_suitable_treated) %>%
  summarize(gallons_injected = sum(steam_water_injected_bbl, na.rm = TRUE) * 42) %>%
  mutate(percent = round(100 * gallons_injected/sum(gallons_injected),2),
         gallons_injected = prettyNum(gallons_injected, big.mark = ",")) %>%
  arrange(-percent) %>%
  datatable()

A quarter of the water listed as being obtained from domestic water suppliers was marked as being suitable for domestic or irrigation use in its untreated state. We then looked at the use of this high-quality water by year.

q_inject %>%
  filter(!grepl("Offshore",county) & well_type_code != "WD" & water_source_text == "Domestic Water System" & water_suitable == "Y") %>%
  group_by(county2,year) %>%
  summarize(gallons_injected = sum(steam_water_injected_bbl, na.rm = TRUE) * 42) %>%
  mutate(gallons_injected = prettyNum(gallons_injected, big.mark = ",")) %>%
  pivot_wider(names_from = county2, values_from = gallons_injected) %>%
  mutate_all(~replace(., is.na(.), 0)) %>%
  datatable()

Again, Kern County accounted for the vast majority of the high-quality water from domestic suppliers. In 2021, as the county entered a severe drought, the use of this water appeared to increase.

We then looked at the suppliers of this water in Kern County, which required some further data cleaning.

kern_domestic_suitable <- q_inject %>%
  filter(!grepl("Offshore",county) & well_type_code != "WD" & water_source_text == "Domestic Water System" & water_suitable == "Y" & county == "Kern") 
  
unique(kern_domestic_suitable$water_source_name)
## [1] "West Kern"                "West Kern Water"         
## [3] "West Kern Water District" "03"

Three of these entries seem to represent the West Kern Water District, so could be consolidated.

kern_domestic_suitable <- kern_domestic_suitable %>%
  mutate(water_source_name2 = case_when(grepl("West Kern",water_source_name) ~ "West Kern Water District",
                                        TRUE ~ water_source_name)) 
  
kern_domestic_suitable %>%
  group_by(water_source_name2) %>%
  summarize(gallons_injected = sum(steam_water_injected_bbl, na.rm = TRUE) * 42) %>%
  mutate(percent = round(100 * gallons_injected/sum(gallons_injected),2),
         gallons_injected = prettyNum(gallons_injected, big.mark = ",")) %>%
  arrange(-percent) %>%
  datatable()

"03" appears to be a data entry error, with the operator repeating the code for water from a domestic supplier, rather than entering the name of that supplier.

The vast majority, if not all, of this water came from the West Kern Water District. We then looked at the operators using high-quality water marked as coming from the West Kern Water District.

kern_domestic_suitable %>%
  filter(water_source_name2 == "West Kern Water District") %>%
  group_by(water_source_name2, operator_name) %>%
  summarize(gallons_injected = sum(steam_water_injected_bbl, na.rm = TRUE) * 42) %>%
  mutate(percent = round(100 * gallons_injected/sum(gallons_injected),2),
         gallons_injected = prettyNum(gallons_injected, big.mark = ",")) %>%
  arrange(-percent) %>%
  datatable()

Sentinel Peak Resources California accounted for around two-thirds of the use of high-quality water supplied by the West Kern Water District. We then looked at its use of this water by year.

kern_domestic_suitable %>%
  filter(water_source_name2 == "West Kern Water District" & grepl("Sentinel",operator_name)) %>%
  group_by(year,water_source_name2,operator_name) %>% 
  summarize(gallons_injected = sum(steam_water_injected_bbl, na.rm = TRUE) * 42) %>%
  mutate(gallons_injected = prettyNum(gallons_injected, big.mark = ",")) %>%
  datatable()

We next looked at the water from domestic suppliers marked as not suitable for domestic or irrigation use in its untreated state. The vast majority of this water was marked as being subjected to some treatment before injection into for use in fossil fuel extraction (see above).

q_inject %>%
  filter(!grepl("Offshore",county) & well_type_code != "WD" & water_source_text == "Domestic Water System" & water_suitable == "N") %>%
  group_by(county2, year) %>%
  summarize(gallons_injected = sum(steam_water_injected_bbl, na.rm = TRUE) * 42) %>%
  mutate(gallons_injected = prettyNum(gallons_injected, big.mark = ",")) %>%
  pivot_wider(names_from = county2, values_from = gallons_injected) %>%
  mutate_all(~replace(., is.na(.), 0)) %>%
  datatable()

We then looked at the suppliers of this water in Kern County, which required some further data cleaning, as before.

kern_domestic_not_suitable <- q_inject %>%
  filter(!grepl("Offshore",county) & well_type_code != "WD" & water_source_text == "Domestic Water System" & water_suitable== "N" & county == "Kern") 
  
unique(kern_domestic_not_suitable$water_source_name)
## [1] "Lokern - EOR Steam"          "West Kern Water - EOR Steam"
## [3] "West Kern Water"             "West Kern Water District"

All of the entries marked "Lokern - EOR Steam" were from reports made by Chevron. Enquiries to the company revealed that these entries actually referred to water from the West Kern Water District. So entries corresponding to the West Kern Water District could be consolidated.

kern_domestic_not_suitable <- kern_domestic_not_suitable %>%
  mutate(water_source_name2 = case_when(grepl("kern",water_source_name, ignore.case = TRUE) ~ "West Kern Water District",
                                        TRUE ~ water_source_name)) 
  
kern_domestic_not_suitable %>%
  group_by(water_source_name2) %>%
  summarize(gallons_injected =