
What is best way to calculate % of time?

7 messages · Bert Gunter, Neotropical bat risk assessments, John Kane +1 more

#
Hi Bruce,
You replied just to me. I have taken the liberty of cc:ing R-help, as there
are many people there more knowledgeable than me who may be able to help.
In the meantime I remain confused.
Here is my impression of the sample data that you supplied. I have combined
Date & Time into a single POSIXct variable, dtime. Just paste it into
<b>R</b>.
##===============================================================##
dat2 <- structure(list(Species = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), .Label =
c("Buzz", "Ptedav", "Ptemes"), class = "factor"), Location = c(7716L,
7716L, 7716L, 7716L, 7716L, 7716L, 7716L, 7717L, 7717L, 7717L, 7717L,
7717L, 7717L, 7717L, 7717L, 7717L, 7717L, 7717L, 7717L, 7717L, 7717L,
7717L, 7717L, 7717L, 7717L, 7717L, 7717L, 7717L, 7717L, 7717L, 7717L,
7717L, 7717L, 7717L, 7717L, 7717L, 7717L, 7717L, 7717L, 7717L, 7717L,
7717L), dtime = structure(c(948758700, 948758700, 948758700, 948761220,
948761220, 948761220, 948761220, 962655420, 962655420, 962655420,
962655420, 962656200, 962656200, 962656200, 962656200, 962655240,
962655300, 962655300, 962655300, 962655300, 962655300, 962655420,
962655420, 962655420, 962655480, 962655480, 962655480, 962655480,
962655480, 962666100, 962666460, 962666520, 962666580, 962666700,
962666760, 962666820, 962666880, 962666940, 962667180, 962667300,
962667360, 962667420), class = c("POSIXct", "POSIXt"), tzone = "UTC")),
class = "data.frame", row.names = c(NA, -42L))
##===============================================================##
<b>The 6 letter species codes relate to individual bat species and the Buzz
= Feeding buzz that indicates a feeding attempt by a given bat. So the
"codes" are both species and information on the call type.</b>
But, at the moment, you have two variables in the one column, Species: the
type of bat and the feeding behaviour.
<b>The date/time is when the species was recorded and is linked to the
location.</b>
Okay. Will this give us a unique key?
<b>Therefore, to run the summary stats I need, I will need to remove the
duplicate times that are rounded to the minute.</b>
What duplicate times? Where are they? When are they rounded?
I have never used Access. Will it produce a data dictionary? Can it export
a small subset of the relevant data to another Access DB, some other DB or
in .csv format? At the moment I just cannot visualize what your data layout
looks like.
Can you point us to any documentation that explains what information is
being gathered? Preferably in simple-minded English.

On Wed, 25 Dec 2019 at 12:08, Neotropical bat risk assessments <
neotropical.bats at gmail.com> wrote:

#
Tnx John,
Yep failed to "reply all" my bad.

Yes, the mix of "information" on call type and species is in the same
field. It will be a single mouse click to export only the "buzz" data
from the master DB as a separate CSV file from the species data. So this
could be a new DF/data set for R.
This is due to the legacy issues of how the acoustic data are added to 
the metadata of the bat call recordings.
Combining date and times does not provide for sampling nights that roll 
over after midnight.
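One common workaround for nights that straddle midnight (a sketch only, not part of Bruce's actual workflow; the timestamps below are invented for illustration) is to shift each timestamp back by 12 hours before taking the date, so that detections after midnight are assigned to the night on which sampling started:

```r
# Assign each detection to a "sampling night": subtracting 12 hours
# pushes post-midnight records back onto the previous calendar date.
dtimes <- as.POSIXct(c("2000-07-03 23:50:00", "2000-07-04 01:10:00"),
                     tz = "UTC")
night <- as.Date(dtimes - 12 * 3600)
print(night)
# Both detections fall on the sampling night of 2000-07-03.
```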

I may need to reread Hadley's Tidy Data manifesto re: data handling ;-).

The location, date and time does provide unique variables.
Duplicate times mentioned are the "duplicated" values you noted. This
happens because the actual call files include seconds for a more precise
time when the recordings were made. Rounding to the nearest minute suffices
for a summary of total minutes spent with "feeding attempts" vs total
active time.

The data being gathered are reviewed in a purpose-built bat acoustic
software program when reviewing bat call files. The metadata include
the "Who", "Where" & "When" recorded. What is added to this as the
acoustic files are reviewed is information on call types and species IDs.

This metadata is exported as a TXT file and imported into a master
Access DB I developed over the past 15 years to manage "BIG DATA", as
they say. As a note, I currently have >1.9 million acoustic call
records stored in the relational DB.

The data output/exported from the DB are sufficient to provide wonderful
temporal activity plots using ggplot2. Original code for this was
developed with huge assistance from Hadley eons ago and has been updated
to more recent R releases and packages by a few others.

The graphics are great to visualize temporal activity but do not provide 
a simple summary of amount of time spent "foraging AKA feeding buzz 
data" vs total activity time for each species.

Perhaps this is/was not a simple question on how to summarize time data
to derive a % of each category, be it "buzz" or species.

Tnx again,
Bruce

#
I will not get into your explanation of details that, like John, I find
opaque. Please DO read Hadley's manifesto, as it appears that you need to
organize your data more appropriately.

AFAICS, however, strictly speaking your data cannot answer the question you
have posed. **Strictly speaking** to know the proportion of active time
bats spend feeding, **for each bat** you would need to know when it is
active and when it is feeding during that time. You could then summarize
this for all bats (e.g. take the average or median proportion) in a species
or whatever. As you cannot identify individual bats in your data, you
cannot do this -- i.e. you cannot answer your question.

So the question then becomes: precisely **how** do you propose using the
data you have to determine when a *group* of bats is active and when it is
feeding? How are the groups explicitly identified, and how are their times
active and feeding determined? In short, you need to have information that
is something like:

Bat.Group  date  active.time.start  active.time.end  feeding.time.start  feeding.time.end

(For a given date and bat group, there may be multiple entries; perhaps,
for a given group, date, and active-time start and end, several
feeding-time start/stop entries. I have no idea how bats behave.)

Until you can explicitly explain how your data can generate such
information, I think it will be difficult/impossible to help you.

Cheers,
Bert


Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Wed, Dec 25, 2019 at 1:52 PM Neotropical bat risk assessments <
neotropical.bats at gmail.com> wrote:

#
Hi Bert,

Tnx for taking time to reply.
For clarification... the data do EXPLICITLY indicate when each species
is active and when a feeding buzz is recorded.
That is ALL it provides, based on acoustic data recorded in the field.
Only when a species is recorded is it identified as active.
How this is accomplished is of no importance to the question I asked.

Note this is not "individuals" per se, but species as a group.

I appreciate you taking time to reply.
Clearly there is not a simple solution to what I assumed to be a simple
question.
Restated as...
*How best to use R to calculate the occurrence of event (A) over time vs
all events (B...N) over the same time period, given the data framework I
have.*
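As a minimal sketch of that restated question (the event labels below are invented for illustration), the share of event (A) among all events reduces to prop.table() on a table of counts:

```r
# Toy event log: one label per detected 1-minute event.
events <- c("Buzz", "Ptedav", "Ptedav", "Ptemes", "Buzz", "Ptemes", "Ptemes")

# Percent of all events accounted for by each category:
round(100 * prop.table(table(events)), 1)
# Buzz and Ptedav each account for 28.6%, Ptemes for 42.9%.
```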

Cheers,
Bruce

#
Hi Bruce,
<b> Combining date and times does not provide for sampling nights that roll
over after midnight. </b>
Ah yesss legacies
<b>The location, date and time does provide unique variables.</b>
Ah, I thought so.
<b>Rounding to nearest minute suffices for a summary of total minutes spent
with "feeding attempts" vs total active time.</b>
Okay, that removes my worry about durations. We can just treat each entry
as one elapsed minute?
However, I still do not grasp the duplication issue. We have in my data frame:
Species Location dtime
Ptedav 7717 2000-07-03 20:15:00
Ptedav 7717 2000-07-03 20:15:00
Ptedav 7717 2000-07-03 20:15:00
Ptedav 7717 2000-07-03 20:15:00
Ptedav 7717 2000-07-03 20:15:00
I assume that this represents 5 separate recordings but that they can be
collapsed into one 1-minute data point?  If so, then would not all you need
to do be to run a simple table() command? To handle the Buzz, one would
produce the Buzz data.frame and merge it with the new species data.frame?
I must be missing something. It looks too simple.
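A sketch of that suggestion, rebuilding dat2 compactly from the dput output posted earlier (same values, shorter form):

```r
# Compact rebuild of John's dat2 (identical values to the dput above).
dat2 <- data.frame(
  Species  = factor(rep(c("Buzz", "Ptedav", "Ptemes"), c(15, 14, 13))),
  Location = c(rep(7716L, 7), rep(7717L, 35)),
  dtime    = as.POSIXct(c(948758700, 948758700, 948758700, 948761220,
    948761220, 948761220, 948761220, 962655420, 962655420, 962655420,
    962655420, 962656200, 962656200, 962656200, 962656200, 962655240,
    962655300, 962655300, 962655300, 962655300, 962655300, 962655420,
    962655420, 962655420, 962655480, 962655480, 962655480, 962655480,
    962655480, 962666100, 962666460, 962666520, 962666580, 962666700,
    962666760, 962666820, 962666880, 962666940, 962667180, 962667300,
    962667360, 962667420), origin = "1970-01-01", tz = "UTC"))

# Collapse repeated detections within the same minute to one data point,
# then count distinct active minutes per species and location.
dat2u <- unique(dat2[, c("Species", "Location", "dtime")])
with(dat2u, table(Species, Location))
#          Location
# Species  7716 7717
#   Buzz      2    2
#   Ptedav    0    4
#   Ptemes    0   13
```

In this sample the timestamps are already at minute resolution, so unique() is enough; with seconds present you would first truncate, e.g. with trunc(dat2$dtime, "mins").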



On Wed, 25 Dec 2019 at 18:11, Neotropical bat risk assessments <
neotropical.bats at gmail.com> wrote:

1 day later
#
Well

If you can make a ggplot based on your data, there should be a way to
produce a summary. Just out of curiosity, are the data you showed us the
same as those you use for the ggplot construction?

Maybe I misunderstood your question, but let's assume you have records from
each location but only BUZZ time is indicated, rounded to minutes.

I would use a few steps.
First step - sort the data frame according to time and aggregate it to get
minutes of BUZZ time during a time period.

Second step - merge an artificial data frame with the full time in minutes
(either 525600 or 527040 rows for a normal or leap year, if the full day
should be considered) with your BUZZ data frame. This is a somewhat
complicated step if you want to consider only night time; before merging
you should remove daytime. Or maybe you are already able to extract one
data frame with Buzz activity and one data frame with the timespan for
each day. Either way, the result of this merging step should be a data
frame in which you have one time column for total time in minutes and one
column indicating when the buzz was observed.

Third step - aggregate the resulting data frame to get BUZZ time in each
day, either by table() as suggested or by ?aggregate.

Based on the data from John, here is the aggregated data frame:

dat2.ag <- aggregate(dat2$dtime, list(dat2$Species, dat2$Location,
    format(dat2$dtime, "%d.%m.%Y %H:%M")), min)

And the result of table (here via with(dat2.ag, table(Group.1, Group.2)),
since aggregate() names the grouping columns Group.1, Group.2, ...):

         7716 7717
  Buzz      2    2
  Ptedav    0    4
  Ptemes    0   13

indicating 2 minutes Buzz in location 7716 and 2 minutes in location 7717.
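Turning those counts into the percentage originally asked for (feeding-buzz minutes vs all recorded minutes per location) is then one line; the matrix below simply re-enters the table above by hand:

```r
# Minutes per category and location, copied from the table above.
counts <- matrix(c(2,  2,
                   0,  4,
                   0, 13),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(c("Buzz", "Ptedav", "Ptemes"),
                                 c("7716", "7717")))

# Percent of all recorded minutes that were feeding buzzes, by location:
round(100 * counts["Buzz", ] / colSums(counts), 1)
#  7716  7717
# 100.0  10.5
```

The 100% at location 7716 only reflects that this tiny sample contains nothing but Buzz minutes there; with the full data set the denominator would include all species' active minutes.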

Cheers
Petr
#
Tnx all for the helpful suggestions.

Life is good.
Happy holidays
Bruce