Hi all, the following function, get_frequency_of_events,
detects the frequency of consecutive repeated numbers. For example,
import numpy as np
aa=np.array([1,2,2,3,3,3,4,4,4,4,5,5,5,5,5])
get_frequency_of_events(aa)
This outputs the following:
list of indices @ the beginning of each group [1, 3, 6, 10]
frequency of each group [2, 3, 4, 5]
Another example,
aa=np.array([1,1,1,np.nan,np.nan,1,1,np.nan])
idx, feq= get_frequency_of_events(aa)
list of indices @ the beginning of each group [0, 5]
frequency of each group [3, 2]
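The second example depends on NaN never comparing equal to anything, including another NaN, so the two adjacent NaN values are not reported as a group. A small check of that behaviour, comparing each element with its right-hand neighbour (the name neighbour_equal is only illustrative):

```python
import numpy as np

# NaN compares unequal to everything, including itself.
print(np.nan == np.nan)  # False

aa = np.array([1, 1, 1, np.nan, np.nan, 1, 1, np.nan])

# Element-wise comparison of each value with its right-hand neighbour.
neighbour_equal = aa[:-1] == aa[1:]
print(neighbour_equal.tolist())
# [True, True, False, False, False, True, False]
```

Only the comparisons at positions 0, 1 and 5 are True, which is why the groups start at indices 0 and 5.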
However, this function is slow, especially when it is called repeatedly over 3D data. How can I vectorize it to make the processing faster?
Here is the function:
def get_frequency_of_events(mydata):
    """
    Author : Shaaban
    Date : Jan 22, 2025
    Purpose : get the frequency of repeated consecutive numbers and their indices. This is important when finding the frequency of heatwaves, etc. All we have to do is build a matrix of ones (or any other number) and NaN. One refers to the existence of the EVENT, and NaN refers to the absence of the event. This function then gives a summary of the frequency of the events and their associated indices.
    tests :
    aa=np.array([1,1,0,0,0,1,0,1,1,1,1,0,1,1])
    get_frequency_of_events(aa)
    aa=np.array([1,2,2,3,3,3,4,4,4,4,5,5,5,5,5])
    get_frequency_of_events(aa)
    aa=np.array([1,1,1,1,0,0,1,1,1])
    get_frequency_of_events(aa)
    aa=np.arange(10)
    get_frequency_of_events(aa)
    aa=np.ones(10)
    get_frequency_of_events(aa)
    # CAUTION CAUTION CAUTION
    # For heatwave numbers, etc., make your array consist of a fixed number (any number) that is associated with an event, and NaN for days/hours/months not associated with events. The trick here is that no NaN could ever be equal to another NaN.
    aa=np.array([1,1,1,np.nan,np.nan,1,1,np.nan])
    idx, feq= get_frequency_of_events(aa)
    """
    index_list=[]
    events_frequency_list=[]
    idx_last_num=len(mydata)-1
    counter=0
    ii=0
    while(ii <= idx_last_num-1):
        #print( '@ index = '+str(ii) )
        counter=0
        while(mydata[ii] == mydata[ii+1]):
            print(' Find match @ '+str(ii)+' & '+str(ii+1)+\
                  ' data are '+str(mydata[ii])+' & '+str(mydata[ii+1]))
            # store the index of the first match of each group.
            if counter == 0:
                index_list.append(ii)
            ii=ii+1
            counter=counter+1
            # break from while if this is the last element in the array.
            if ii==idx_last_num:
                break
        # if we were just inside the inner loop, store the number of events
        if counter != 0:
            no_events=counter+1
            events_frequency_list.append(no_events)
        # advance the index, also when there was no match at all in the inner while.
        ii=ii+1
    print('list of indices @ the beginning of each group ')
    print(index_list)
    print(' frequency of each group.')
    print(events_frequency_list)
    return index_list, events_frequency_list
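To make the goal concrete, here is a minimal sketch of what a loop-free version might look like for a single 1D series, built on np.diff and np.flatnonzero. The name run_starts_and_lengths is hypothetical and not part of the code above; the sketch assumes 1D input and omits the print calls. It is a starting point rather than a verified replacement.

```python
import numpy as np

def run_starts_and_lengths(data):
    """Hypothetical vectorized counterpart of get_frequency_of_events (1D only)."""
    data = np.asarray(data)
    if data.size < 2:
        return [], []
    # same[i] is True when data[i] == data[i + 1]; NaN never equals NaN,
    # so runs of NaN are not detected, as in the loop version.
    same = data[1:] == data[:-1]
    # Pad with False on both sides so every run of True values has both
    # a rising edge (+1) and a falling edge (-1) in the difference.
    padded = np.concatenate(([False], same, [False]))
    edges = np.diff(padded.astype(np.int8))
    starts = np.flatnonzero(edges == 1)
    ends = np.flatnonzero(edges == -1)
    # A run of k True comparisons covers k + 1 equal elements.
    lengths = ends - starts + 1
    return starts.tolist(), lengths.tolist()

idx, freq = run_starts_and_lengths(
    np.array([1, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 5]))
print(idx, freq)   # [1, 3, 6, 10] [2, 3, 4, 5]

idx, freq = run_starts_and_lengths(
    np.array([1, 1, 1, np.nan, np.nan, 1, 1, np.nan]))
print(idx, freq)   # [0, 5] [3, 2]
```

This only covers one series at a time. For 3D data the number of groups differs from one series to the next, so the results are ragged and would still need to be collected per series or stored in some padded form.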