Filtering

Filtering by the metadata

One of the core methods of behavpy. This method creates a new behavpy object that only contains specimens whose metadata matches your inputted list. Use this to separate out your data by experimental conditions for further analysis.

# filter your dataset by variables in the metadata wtih .xmv()
# the first argument is the column in the metadata
# the second can be the variables in a list or as subsequent arguments

df = df.xmv('species', ['D.vir', 'D.ere', 'D.wil', 'D.sec', 'D.yak', 'D.sims'])
# or
df = df.xmv('species', 'D.vir', 'D.ere', 'D.wil', 'D.sec', 'D.yak', 'D.sims')

# the new data frame will only contain data from specimens with the selected variables

Removing by the metadata

The inverse of .xmv(). Remove from both the data and metadata any experimental groups you don't. This method can be called on individual specimens by calling 'id' and their unique identifier.

# remove specimens from your dataset by the metadata with .remove()
# remove acts like the opposite of .xmv()

df = df.remove('species', ['D.vir', 'D.ere', 'D.wil', 'D.sec', 'D.yak', 'D.sims'])
# or
df = df.remove('species', 'D.vir', 'D.ere', 'D.wil', 'D.sec', 'D.yak', 'D.sims')

# both .xmv() and .remove() can filter/remove by the unique id if the first argument = 'id'
df = df.remove('id', '2019-08-02_14-21-23_021d6b|01')

Filtering by time

Often you will want to remove the start of experiments when the data isn't as clean or you'll want to just look at the baseline data before something occurs. Use t_filter() to filter the dataset between two time points.

# filter you dataset by time with .t_filter()
# the arguments take time in hours
# the data is assumed to represented in seconds

df = df.t_filter(start_time =  24, end_time = 48)

# Note: the default column for time is 't', to change use the parameter t_column

Last updated