## LO Calc calculations

### LO Calc calculations

I am working on a large Calc spreadsheet for a client that contains a lot of historical data concerning a small village school, the original documentation for which originates from the 1880s and carries on through until at least the 1990s. The source documents have columns that contain the dates pupils were first registered, the dates when they left and their dates of birth. In each of the three sets of dates the information is shewn in three columns, one each for the day number, month number and the year. The requirement is for these to be aggregated into a single date column which I have done. Each of the original date columns are formatted as of type ‘Number -1234’ while the aggregated column I have added is formatted as ‘Date’ with the date code as ‘NNNND MMMM YYYY’ so that the display shews the day, day number, month and the year all in full. An example of the code I used to combine the three columns is '=DATEVALUE(D74 & "/" & E74 & "/" & F74'.

This is fine while the original data is complete but herein lies the problem. There are instances when the data in one or other of the three source date cells is blank. This results, not unexpectedly, in an error. (Err:502 – Invalid argument.) I need to refine the aggregated column so that it shews the remaining information. So, as a for instance, if the month is omitted in the source the aggregated column will show the day name and number and the year, or if say the year is omitted the aggregated column will shew the day name, day number and the month name, with, in each case a space, or if two pieces of information are missing, two spaces, where the missing information should be. I am at a loss on how to achieve this.

A further complication is that I need to add an additional column to shew the children’s full ages when they are first registered. I have the date of birth and the date of registration and can calculate their ages in total years by using the formulae ‘=YEAR(G7)-YEAR(N7)’ However this does not provide the age in days and months as well as the years. The ideal requirement here is that the child is so many days, so many months and so many years old at the date of registration although just months and years would be acceptable. Of course this will be further complicated when some of the original source data is omitted.

Is Calc capable of providing any of these additional requirements and if so how? They certainly are far beyond my capabilities. Can anyone offer any advice please?

This is fine while the original data is complete but herein lies the problem. There are instances when the data in one or other of the three source date cells is blank. This results, not unexpectedly, in an error. (Err:502 – Invalid argument.) I need to refine the aggregated column so that it shews the remaining information. So, as a for instance, if the month is omitted in the source the aggregated column will show the day name and number and the year, or if say the year is omitted the aggregated column will shew the day name, day number and the month name, with, in each case a space, or if two pieces of information are missing, two spaces, where the missing information should be. I am at a loss on how to achieve this.

A further complication is that I need to add an additional column to shew the children’s full ages when they are first registered. I have the date of birth and the date of registration and can calculate their ages in total years by using the formulae ‘=YEAR(G7)-YEAR(N7)’ However this does not provide the age in days and months as well as the years. The ideal requirement here is that the child is so many days, so many months and so many years old at the date of registration although just months and years would be acceptable. Of course this will be further complicated when some of the original source data is omitted.

Is Calc capable of providing any of these additional requirements and if so how? They certainly are far beyond my capabilities. Can anyone offer any advice please?

- JoeFootball
- Level 8
**Posts:**2214**Joined:**Tue Nov 24, 2009 1:52 pm**Location:**/home/usa/mn/minneapolis/joe

### Re: LO Calc calculations

Example:bigal wrote: There are instances when the data in one or other of the three source date cells is blank.

`=IF(ISBLANK(A1),1,A1)`

, where when cell A1 is blank, the value is 1, otherwise it's the value of A1.Your age requirements are going to be a tougher nut to crack.

### Re: LO Calc calculations

Thanks for the suggestion JoeFootball. My examples were incomplete in as much as I way already using an ‘if’ function to prevent a whole pile of errors in rows where data had not yet been entered. It is a version of the ‘if’ function I use a lot. e.g.
.

I didn’t shew it as it just seem to complicate the post. However, your alternative seem like it might be an alternative/better idea, with additional ‘if’ functions nested as required. It is certainly food for thought.

Tongue in cheek time. Thanks. You have just given me some after hours homework. Hmmm.

Actually I find problems like this very interesting.

I have some 2000 records left to add to the spreadsheet which, at the moment is around 30 columns wide and growing as I find more data. I’ll set up a dummy to experiment with and come back (much) later on.

Cheers

Code: Select all

`=IF(M7="","",DATEVALUE(K7 & "/" & L7 & "/" & M7))`

I didn’t shew it as it just seem to complicate the post. However, your alternative seem like it might be an alternative/better idea, with additional ‘if’ functions nested as required. It is certainly food for thought.

Tongue in cheek time. Thanks. You have just given me some after hours homework. Hmmm.

Actually I find problems like this very interesting.

I have some 2000 records left to add to the spreadsheet which, at the moment is around 30 columns wide and growing as I find more data. I’ll set up a dummy to experiment with and come back (much) later on.

Cheers

### Re: LO Calc calculations

Hibigal wrote: ↑Fri Feb 14, 2020 11:00 amA further complication is that I need to add an additional column to shew the children’s full ages when they are first registered. I have the date of birth and the date of registration and can calculate their ages in total years by using the formulae ‘=YEAR(G7)-YEAR(N7)’ However this does not provide the age in days and months as well as the years. The ideal requirement here is that the child is so many days, so many months and so many years old at the date of registration although just months and years would be acceptable. Of course this will be further complicated when some of the original source data is omitted.

**bigal**

I realise that this isn't exactly what you want, but ...

DoBirth..........DoEntry.........Age

02/12/01........05/09/06.......4 3/4

Formula = (B2-A2)/365

Format = FRACTION

I'll see if I can find my thinking hat for a better solution.

Jon

### Re: LO Calc calculations

To incorporate what JoeFootball suggested, change that to:bigal wrote: ↑Fri Feb 14, 2020 11:52 amThanks for the suggestion JoeFootball. My examples were incomplete in as much as I way already using an ‘if’ function to prevent a whole pile of errors in rows where data had not yet been entered. It is a version of the ‘if’ function I use a lot. e.g..Code: Select all

`=IF(M7="","",DATEVALUE(K7 & "/" & L7 & "/" & M7))`

Code: Select all

`=IF(M7="","",DATEVALUE(IF(ISBLANK(K7),1,K7) & "/" & IF(ISBLANK(L7),1,L7) & "/" & M7))`

[/quote]

If you subtract two dates, (RegistrationDate - DateOfBirth), you'll get the time between the two dates in days. You can either divide that directly as dorsetUK suggested, or you can format that "days between dates" difference as a date in YY/MM/DD format to get years, months and days between the two dates.bigal wrote: ↑Fri Feb 14, 2020 11:52 amI didn’t shew it as it just seem to complicate the post. However, your alternative seem like it might be an alternative/better idea, with additional ‘if’ functions nested as required. It is certainly food for thought.

Tongue in cheek time. Thanks. You have just given me some after hours homework. Hmmm.

Actually I find problems like this very interesting.

I have some 2000 records left to add to the spreadsheet which, at the moment is around 30 columns wide and growing as I find more data. I’ll set up a dummy to experiment with and come back (much) later on.

Cheers

### Re: LO Calc calculations

Birth ..............Entry

12/03/01......14/06/05

4 Years. Formula = YEARS(A1,B1,0)

3 Months. Formula = MONTHS(A1,B1,0)-12*B4

2 Days. Formula = B1-EDATE(A1,12*B4+B5)

Hope that's what you wanted. Edit; The Years formula is in B4.

Jon.

12/03/01......14/06/05

4 Years. Formula = YEARS(A1,B1,0)

3 Months. Formula = MONTHS(A1,B1,0)-12*B4

2 Days. Formula = B1-EDATE(A1,12*B4+B5)

Hope that's what you wanted. Edit; The Years formula is in B4.

Jon.

### Re: LO Calc calculations

To clarify. Here is an image of what I have at present

The code in cell N9 is
As the day number is missing from cell K9 the output I would like in cell N9 would be “May 1872”

If K9 actually contained a number, for argument sake 2 but the omission was in L9 the idea is that the output would show something like 2 --- 1872.

I am wondering if this should be a multi stepped process with additional but normally hidden columns providing some intermediate calculations.

The code in cell N9 is

Code: Select all

`=IF(M9="","",DATEVALUE(K9 & "/" & L9 & "/" & M9))`

If K9 actually contained a number, for argument sake 2 but the omission was in L9 the idea is that the output would show something like 2 --- 1872.

I am wondering if this should be a multi stepped process with additional but normally hidden columns providing some intermediate calculations.

### Re: LO Calc calculations

As a thought, this sounds like more of a solution for a Database than a spreadsheet. What is the purpose of it all?

Kernel: 4.15.0-46-generic x86_64 bits

Desktop: Cinnamon 3.8.9

Distro: Linux Mint 19 Tara

Laptop HP-ProBook-470-G2 8Gb RAM SSD

Server AMD Phenom 9650 - GEForce 9400GT 6Gb RAM

+ three other Mint machines

Desktop: Cinnamon 3.8.9

Distro: Linux Mint 19 Tara

Laptop HP-ProBook-470-G2 8Gb RAM SSD

Server AMD Phenom 9650 - GEForce 9400GT 6Gb RAM

+ three other Mint machines

**Out of my mind - please leave a message**### Re: LO Calc calculations

Your supposition is correct, lsemmens. The plan is to gather all the data into a spreadsheet and once that is finalised create a database from it. I am not very familiar with DB design so thought it best to have all the fields (columns) that are needed for the DB tables before sitting down and creating a UML diagram shewing all the links between the various tables in the DB. At present I am still unsure exactly sure if more columns are needed in the spreadsheet and, if they are, how they should be formatted hence, at least in part, my post.

In the meantime I have to speak to my client, hopefully some time this week, to clarify some more details of his requirements.

In the meantime I have to speak to my client, hopefully some time this week, to clarify some more details of his requirements.

### Re: LO Calc calculations

Have you any experience with database design and development? It not as simple as a spreadsheet in some ways, and more so in others. DO you know what I mean by a relational model?

Kernel: 4.15.0-46-generic x86_64 bits

Desktop: Cinnamon 3.8.9

Distro: Linux Mint 19 Tara

Laptop HP-ProBook-470-G2 8Gb RAM SSD

Server AMD Phenom 9650 - GEForce 9400GT 6Gb RAM

+ three other Mint machines

Desktop: Cinnamon 3.8.9

Distro: Linux Mint 19 Tara

Laptop HP-ProBook-470-G2 8Gb RAM SSD

Server AMD Phenom 9650 - GEForce 9400GT 6Gb RAM

+ three other Mint machines

**Out of my mind - please leave a message**### Re: LO Calc calculations

My inexperience in database design is the main reason I am using Calc first. I have designed a couple of small data bases but this will be a bigger project. I have, though, done a fair bit or reading up on design but putting into practice on this scale will be a new experience for me. And yes, I do understand the importance of relationships. This task is a bit of a learning curve for me which in turn is helpful to my client. i.e. cheaper.

### Re: LO Calc calculations

What is the volume of data you'll be working with?

LO Calc is probably just fine up into the tens of thousands of records, maybe even a few hundred thousand. IIRC, the upper limit for Calc is about a million rows.

If you're working with millions of records, then a database would be a better choice. LibreOffice Base might do the trick, you you might need a full on SQL database in a server/client model.

About working with dates. Date math (ages, days ellapsed between dates) is something that LibreOffice Calc handles well, and most databases can handle these types of calculations pretty well. There are limits to date math, though. First, it only works with precise dates. I'd strongly recommend not trying to get too "fuzzy". Footnote your results with something like "When no day of the month was available, we assumed the 15th. When only a year was available, we used July 1st of that year for our calculations." You could even give a count of how many "assumed dates" are in your dataset as a percentage of the total. If you do that, you'll have precise dates to do date math with. If you don't, then you'll be pulling your hair out trying to come up with "fuzzy functions" to handle the few dates with missing pieces.

The second thing I'll say is "days between dates" math (subtracting two date numbers) are generally accurate to the day with most date math libraries. Adding a number of days or subtracting a number of days is also generally accurate. So, for example, calculating someone's age in days by subtracting their birth date from today's date will give you an accurate age in days.

There are limitations to converting longer periods of days into years/days or years/months/days. For any arbitrary period, the limit of accuracy for expressing x days in years and days of the year is about +/- 2 days on the days of the year part, and for expressing x days in years, months and days, the limit of accuracy is +/- 5 days for an abitrary period length. To be exact, you have to have an actual date and the number of days reference. For arbitrary periods, leap days and the lengths of the various months introduce some ambiguity that can't be resolved without an actual reference date.

Using an actual example might help here. the number of days between today (February 15, 2020) and 5 years from today (February 15, 2025) is different from the number of days between March 15, 2020 and March 15, 2025. Both dates are exactly 5 years apart, but the number of days between those two dates differ by one (because there's 2 leap days in the first example and only one leap day in the second). The same thing happens, but with greater possible variations for periods of days and months, like 5 years, 6 months from today would be August 15, 2025 and 5 years, 6 months from March 15, 2020 would be September 15, 2025. Those periods differ by 2 days.

LO Calc is probably just fine up into the tens of thousands of records, maybe even a few hundred thousand. IIRC, the upper limit for Calc is about a million rows.

If you're working with millions of records, then a database would be a better choice. LibreOffice Base might do the trick, you you might need a full on SQL database in a server/client model.

About working with dates. Date math (ages, days ellapsed between dates) is something that LibreOffice Calc handles well, and most databases can handle these types of calculations pretty well. There are limits to date math, though. First, it only works with precise dates. I'd strongly recommend not trying to get too "fuzzy". Footnote your results with something like "When no day of the month was available, we assumed the 15th. When only a year was available, we used July 1st of that year for our calculations." You could even give a count of how many "assumed dates" are in your dataset as a percentage of the total. If you do that, you'll have precise dates to do date math with. If you don't, then you'll be pulling your hair out trying to come up with "fuzzy functions" to handle the few dates with missing pieces.

The second thing I'll say is "days between dates" math (subtracting two date numbers) are generally accurate to the day with most date math libraries. Adding a number of days or subtracting a number of days is also generally accurate. So, for example, calculating someone's age in days by subtracting their birth date from today's date will give you an accurate age in days.

There are limitations to converting longer periods of days into years/days or years/months/days. For any arbitrary period, the limit of accuracy for expressing x days in years and days of the year is about +/- 2 days on the days of the year part, and for expressing x days in years, months and days, the limit of accuracy is +/- 5 days for an abitrary period length. To be exact, you have to have an actual date and the number of days reference. For arbitrary periods, leap days and the lengths of the various months introduce some ambiguity that can't be resolved without an actual reference date.

Using an actual example might help here. the number of days between today (February 15, 2020) and 5 years from today (February 15, 2025) is different from the number of days between March 15, 2020 and March 15, 2025. Both dates are exactly 5 years apart, but the number of days between those two dates differ by one (because there's 2 leap days in the first example and only one leap day in the second). The same thing happens, but with greater possible variations for periods of days and months, like 5 years, 6 months from today would be August 15, 2025 and 5 years, 6 months from March 15, 2020 would be September 15, 2025. Those periods differ by 2 days.

### Re: LO Calc calculations

Well explained Racer-x I had forgotted all about date calculations, not that I've ever needed to do many over the years.

The only problem with a spreadsheet model is if you have, say, an address that is identical in several records and that address gets changed. a)do all the records reflect the change, if not, you end up with a situation of "which one is correct?"

The only problem with a spreadsheet model is if you have, say, an address that is identical in several records and that address gets changed. a)do all the records reflect the change, if not, you end up with a situation of "which one is correct?"

Kernel: 4.15.0-46-generic x86_64 bits

Desktop: Cinnamon 3.8.9

Distro: Linux Mint 19 Tara

Laptop HP-ProBook-470-G2 8Gb RAM SSD

Server AMD Phenom 9650 - GEForce 9400GT 6Gb RAM

+ three other Mint machines

Desktop: Cinnamon 3.8.9

Distro: Linux Mint 19 Tara

Laptop HP-ProBook-470-G2 8Gb RAM SSD

Server AMD Phenom 9650 - GEForce 9400GT 6Gb RAM

+ three other Mint machines

**Out of my mind - please leave a message**