We start by opening SimThyr (v4.0.6) and creating a .tsv file (see this presentation: https://www.glensbo.dk/circle/2023/02/06/create-and-use-scenarios-in-simthyr/).
Next we import the .tsv file. To get its path: having found the file, place a tick in the box to the right, click the More option above and pick Copy Folder Path to Clipboard. For me this gives ~/Documents/RFolder/Markdown/CIRCLE/Pilo_Kubota_Year/KUBOTA_xml/ and then you just need to add the filename.
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
Kubota5_tsv <- read.table(file = '~/Documents/RFolder/Markdown/CIRCLE/Pilo_Kubota_Year/KUBOTA_xml/Kubota_5.tsv', sep = '\t', header = TRUE)
Specifically, I am after values for SPINA-GD, SPINA-GT and TSHI.
What kinds of columns and information are in the Kubota5 file?
str(Kubota5_tsv)
## 'data.frame': 630722 obs. of 11 variables:
## $ i : int NA 1 2 3 4 5 6 7 8 9 ...
## $ t : chr "day h:m:s" "1900-01-01 00:00:00" "1900-01-01 00:01:40" "1900-01-01 00:03:20" ...
## $ TRH : chr "ng/l" "2500.00" "3010.4316" "3673.7146" ...
## $ pTSH: chr "mU/l" "4.00" "4.00" "4.00" ...
## $ TSH : chr "mU/l" "12.7705" "12.7705" "12.9466" ...
## $ TT4 : chr "nmol/l" "46.3769" "46.3769" "46.3769" ...
## $ FT4 : chr "pmol/l" "6.7203" "6.7203" "6.7203" ...
## $ TT3 : chr "nmol/l" "1.8818" "1.8818" "1.8818" ...
## $ FT3 : chr "pmol/l" "3.1311" "3.1311" "3.1311" ...
## $ cT3 : chr "pmol/l" "4495.876" "4495.876" "4495.876" ...
## $ X : logi NA NA NA NA NA NA ...
The file contains some rubbish columns alongside the data I need and, importantly, a two-row header (names and units), which I would like to merge into a one-row header. I picked this solution:
#https://stackoverflow.com/questions/17797840/reading-two-line-headers-in-r
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
## ✔ ggplot2 3.4.0 ✔ purrr 1.0.1
## ✔ tibble 3.1.8 ✔ stringr 1.5.0
## ✔ tidyr 1.3.0 ✔ forcats 1.0.0
## ✔ readr 2.1.3
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
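# change to csv format (write.csv also writes the row names as an extra first column)
write.csv(Kubota5_tsv, '~/Documents/RFolder/Markdown/CIRCLE/Pilo_Kubota_Year/KUBOTA_xml/Kubota5.csv')
# read the first two rows (names and units) and paste them together column by column
header <- sapply(read.csv("~/Documents/RFolder/Markdown/CIRCLE/Pilo_Kubota_Year/KUBOTA_xml/Kubota5.csv",
                          nrows = 2,
                          header = FALSE),
                 paste,
                 collapse = "_")
# skip both header rows and apply the merged names
# note: header defaults to TRUE in read.csv, so the next line (the i = 1 observation)
# is also consumed as a header line and the data start at i = 2
result <- read.csv("~/Documents/RFolder/Markdown/CIRCLE/Pilo_Kubota_Year/KUBOTA_xml/Kubota5.csv", skip = 2, col.names = header)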
str(result)
## 'data.frame': 630720 obs. of 12 variables:
## $ NA_1 : int 3 4 5 6 7 8 9 10 11 12 ...
## $ i_NA : int 2 3 4 5 6 7 8 9 10 11 ...
## $ t_day.h.m.s: chr "1900-01-01 00:01:40" "1900-01-01 00:03:20" "1900-01-01 00:05:00" "1900-01-01 00:06:40" ...
## $ TRH_ng.l : num 3010 3674 5036 2827 2380 ...
## $ pTSH_mU.l : num 4 4 4 4 4 4 4 4 4 4 ...
## $ TSH_mU.l : num 12.8 12.9 13.2 13.6 13.7 ...
## $ TT4_nmol.l : num 46.4 46.4 46.4 46.4 46.4 ...
## $ FT4_pmol.l : num 6.72 6.72 6.72 6.72 6.72 ...
## $ TT3_nmol.l : num 1.88 1.88 1.88 1.88 1.88 ...
## $ FT3_pmol.l : num 3.13 3.13 3.13 3.13 3.13 ...
## $ cT3_pmol.l : num 4496 4496 4496 4496 4496 ...
## $ X_NA : logi NA NA NA NA NA NA ...
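The CSV round trip is one way to merge the two header rows. As an alternative, here is a minimal base R sketch that reads the header rows straight from the tab-separated file. It avoids the extra row-number column that write.csv introduces, and with header = FALSE no observation is consumed as a header line. The miniature file and the names tsv_lines, tmp, hdr, merged and dat are made up for illustration; this is not the method used in the rest of this post.

```r
# Hypothetical miniature of a SimThyr export: names row, units row, then data
tsv_lines <- c(
  "i\tt\tTSH\tFT4",
  "\tday h:m:s\tmU/l\tpmol/l",
  "1\t1900-01-01 00:00:00\t12.7705\t6.7203",
  "2\t1900-01-01 00:01:40\t12.9466\t6.7203"
)
tmp <- tempfile(fileext = ".tsv")
writeLines(tsv_lines, tmp)

# Read only the two header rows, as plain text
hdr <- read.table(tmp, sep = "\t", nrows = 2, header = FALSE,
                  colClasses = "character", quote = "", comment.char = "")
names_row <- unlist(hdr[1, ], use.names = FALSE)
units_row <- unlist(hdr[2, ], use.names = FALSE)
units_row[is.na(units_row)] <- ""

# Paste name and unit together; keep the bare name where no unit is given
merged <- ifelse(units_row == "", names_row,
                 paste(names_row, units_row, sep = "_"))
merged <- make.names(merged, unique = TRUE)

# Read the data, skipping both header rows and keeping every observation
dat <- read.table(tmp, sep = "\t", skip = 2, header = FALSE,
                  col.names = merged, quote = "", comment.char = "",
                  stringsAsFactors = FALSE)
str(dat)
```

Here make.names() turns characters such as "/" and spaces into dots, which gives the same style of names as above (for example TSH_mU.l).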
There we are. Next I calculate some additional columns.
library(SPINA)
# inside mutate() the columns can be referenced by name, so result$ is not needed
result <- result %>%
  mutate(ratio = FT3_pmol.l / FT4_pmol.l,
         TSH_Sum = (FT4_pmol.l * 0.52) + (FT3_pmol.l * 0.38) + ((FT4_pmol.l + FT3_pmol.l) * 0.1),
         TSH_TSH_Sum = TSH_mU.l / TSH_Sum,
         SPINA_GT = SPINA.GT(TSH_mU.l, FT4_pmol.l),
         SPINA_GD = SPINA.GD(FT4_pmol.l, FT3_pmol.l),
         TSHI = estimated.TSHI(TSH_mU.l, FT4_pmol.l),
         TRH_TSH = TRH_ng.l / TSH_mU.l,
         TT4_FT4 = TT4_nmol.l / FT4_pmol.l,
         TT3_FT3 = TT3_nmol.l / FT3_pmol.l,
         sqrtTSH = sqrt(TSH_mU.l),
         sqrtTRH = sqrt(TRH_ng.l),
         sqrtFT4 = sqrt(FT4_pmol.l),
         sqrtFT3 = sqrt(FT3_pmol.l),
         sqrtTT4 = sqrt(TT4_nmol.l))
str(result)
## 'data.frame': 630720 obs. of 26 variables:
## $ NA_1 : int 3 4 5 6 7 8 9 10 11 12 ...
## $ i_NA : int 2 3 4 5 6 7 8 9 10 11 ...
## $ t_day.h.m.s: chr "1900-01-01 00:01:40" "1900-01-01 00:03:20" "1900-01-01 00:05:00" "1900-01-01 00:06:40" ...
## $ TRH_ng.l : num 3010 3674 5036 2827 2380 ...
## $ pTSH_mU.l : num 4 4 4 4 4 4 4 4 4 4 ...
## $ TSH_mU.l : num 12.8 12.9 13.2 13.6 13.7 ...
## $ TT4_nmol.l : num 46.4 46.4 46.4 46.4 46.4 ...
## $ FT4_pmol.l : num 6.72 6.72 6.72 6.72 6.72 ...
## $ TT3_nmol.l : num 1.88 1.88 1.88 1.88 1.88 ...
## $ FT3_pmol.l : num 3.13 3.13 3.13 3.13 3.13 ...
## $ cT3_pmol.l : num 4496 4496 4496 4496 4496 ...
## $ X_NA : logi NA NA NA NA NA NA ...
## $ ratio : num 0.466 0.466 0.466 0.466 0.466 ...
## $ TSH_Sum : num 5.67 5.67 5.67 5.67 5.67 ...
## $ TSH_TSH_Sum: num 2.25 2.28 2.33 2.4 2.42 ...
## $ SPINA_GT : num 0.62 0.619 0.616 0.613 0.612 ...
## $ SPINA_GD : num 43.1 43.1 43.1 43.1 43.1 ...
## $ TSHI : num 3.45 3.46 3.48 3.51 3.52 ...
## $ TRH_TSH : num 236 284 381 208 173 ...
## $ TT4_FT4 : num 6.9 6.9 6.9 6.9 6.9 ...
## $ TT3_FT3 : num 0.601 0.601 0.601 0.601 0.601 ...
## $ sqrtTSH : num 3.57 3.6 3.63 3.69 3.71 ...
## $ sqrtTRH : num 54.9 60.6 71 53.2 48.8 ...
## $ sqrtFT4 : num 2.59 2.59 2.59 2.59 2.59 ...
## $ sqrtFT3 : num 1.77 1.77 1.77 1.77 1.77 ...
## $ sqrtTT4 : num 6.81 6.81 6.81 6.81 6.81 ...
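As a quick sanity check, some of the derived columns can be reproduced by hand. This is a small base R sketch using the FT4 and FT3 values shown in the raw import earlier (6.7203 and 3.1311 pmol/l); the variable names are just for illustration.

```r
# Values taken from the first rows of the raw import (pmol/l)
FT4 <- 6.7203
FT3 <- 3.1311

ratio   <- FT3 / FT4
TSH_Sum <- (FT4 * 0.52) + (FT3 * 0.38) + ((FT4 + FT3) * 0.1)

# The weighted sum simplifies algebraically to 0.62 * FT4 + 0.48 * FT3
TSH_Sum_short <- 0.62 * FT4 + 0.48 * FT3

round(c(ratio = ratio, TSH_Sum = TSH_Sum,
        TSH_Sum_short = TSH_Sum_short, sqrtFT4 = sqrt(FT4)), 3)
```

The rounded values should agree with the ratio, TSH_Sum and sqrtFT4 columns in the str() output above.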
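Next I want to get rid of columns 1, 2 and 12 (the row numbers, the index and the empty X column).
# keep columns 3 to 11 and 13 to 26; the subset() wrapper is not needed
result <- result[c(3:11, 13:26)]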
str(result)
## 'data.frame': 630720 obs. of 23 variables:
## $ t_day.h.m.s: chr "1900-01-01 00:01:40" "1900-01-01 00:03:20" "1900-01-01 00:05:00" "1900-01-01 00:06:40" ...
## $ TRH_ng.l : num 3010 3674 5036 2827 2380 ...
## $ pTSH_mU.l : num 4 4 4 4 4 4 4 4 4 4 ...
## $ TSH_mU.l : num 12.8 12.9 13.2 13.6 13.7 ...
## $ TT4_nmol.l : num 46.4 46.4 46.4 46.4 46.4 ...
## $ FT4_pmol.l : num 6.72 6.72 6.72 6.72 6.72 ...
## $ TT3_nmol.l : num 1.88 1.88 1.88 1.88 1.88 ...
## $ FT3_pmol.l : num 3.13 3.13 3.13 3.13 3.13 ...
## $ cT3_pmol.l : num 4496 4496 4496 4496 4496 ...
## $ ratio : num 0.466 0.466 0.466 0.466 0.466 ...
## $ TSH_Sum : num 5.67 5.67 5.67 5.67 5.67 ...
## $ TSH_TSH_Sum: num 2.25 2.28 2.33 2.4 2.42 ...
## $ SPINA_GT : num 0.62 0.619 0.616 0.613 0.612 ...
## $ SPINA_GD : num 43.1 43.1 43.1 43.1 43.1 ...
## $ TSHI : num 3.45 3.46 3.48 3.51 3.52 ...
## $ TRH_TSH : num 236 284 381 208 173 ...
## $ TT4_FT4 : num 6.9 6.9 6.9 6.9 6.9 ...
## $ TT3_FT3 : num 0.601 0.601 0.601 0.601 0.601 ...
## $ sqrtTSH : num 3.57 3.6 3.63 3.69 3.71 ...
## $ sqrtTRH : num 54.9 60.6 71 53.2 48.8 ...
## $ sqrtFT4 : num 2.59 2.59 2.59 2.59 2.59 ...
## $ sqrtFT3 : num 1.77 1.77 1.77 1.77 1.77 ...
## $ sqrtTT4 : num 6.81 6.81 6.81 6.81 6.81 ...
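Dropping columns by position works, but it breaks silently if the column order ever changes. A minimal sketch of dropping the same columns by name instead, assuming the column names from the str() output above and using a small made-up data frame (toy and drop_cols are illustrative names):

```r
# Toy data frame with the same unwanted column names as in the real data
toy <- data.frame(1:3, 0:2, c("a", "b", "c"), c(12.8, 12.9, 13.2), NA)
names(toy) <- c("NA_1", "i_NA", "t_day.h.m.s", "TSH_mU.l", "X_NA")

# Drop the unwanted columns by name rather than by index
drop_cols <- c("NA_1", "i_NA", "X_NA")
toy <- toy[, !(names(toy) %in% drop_cols), drop = FALSE]
names(toy)
```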
Now the data frame is in a format that I can use for different visualisations.
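One thing to keep in mind before plotting against time: the t_day.h.m.s column is still a character column. A minimal sketch of parsing it into date-times, assuming the "YYYY-MM-DD HH:MM:SS" format seen in the output above and treating the times as UTC (the time zone is my assumption, and t_chr, t_parsed and elapsed_min are illustrative names):

```r
# First three time stamps as they appear in the data
t_chr <- c("1900-01-01 00:01:40", "1900-01-01 00:03:20", "1900-01-01 00:05:00")

# Parse to POSIXct; tz = "UTC" is an assumption, chosen to avoid local DST effects
t_parsed <- as.POSIXct(t_chr, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Elapsed simulated time in minutes since the first time stamp
elapsed_min <- as.numeric(difftime(t_parsed, t_parsed[1], units = "mins"))
elapsed_min
```

On the real data the same call could be applied to result$t_day.h.m.s.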