How can I show the most recent events per user with Keen IO? - keen-io

Suppose you have a Keen IO collection called "survey-completed" that contains events matching the following pattern:
keen.id: <unique autogenerated id>
keen.timestamp: <autogenerated overridable timestamp>
userId: <hex string for user>
surveyScore: <integer from 1 to 10>
...
How would you create a report of only the most up-to-date satisfaction score for each user that responded to one or more surveys within a given amount of time (such as one week)?

There isn't a really elegant way to make it happen, but for a given userId you could return the most up-to-date event by creating a count query with a group_by on [surveyScore, keen.timestamp] and an order_by on the keen.timestamp property, setting limit=1 to select only the most recent surveyScore.
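For example, in Python against Keen's HTTP query API (a sketch: PROJECT_ID, READ_KEY, the timeframe, and the sample userId are placeholders, and the JSON encoding of the parameters should be checked against your client library):
import json
import requests

PROJECT_ID = "YOUR_PROJECT_ID"  # placeholder
READ_KEY = "YOUR_READ_KEY"      # placeholder
BASE = "https://api.keen.io/3.0/projects/%s/queries" % PROJECT_ID

# Count grouped by [surveyScore, keen.timestamp], newest first, limit 1.
params = {
    "api_key": READ_KEY,
    "event_collection": "survey-completed",
    "timeframe": "this_7_days",
    "group_by": json.dumps(["surveyScore", "keen.timestamp"]),
    "order_by": json.dumps([{"property_name": "keen.timestamp",
                             "direction": "DESC"}]),
    "limit": 1,
    "filters": json.dumps([{"property_name": "userId", "operator": "eq",
                            "property_value": "0123abcd"}]),
}
print(requests.get(BASE + "/count", params=params).json())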
If you'd like to use an extraction, the most straightforward way would be to run an extraction with property_names set to ["userId","keen.timestamp","surveyScore"]. Once you receive the results you can do some client-side post-processing. This is probably the best way if you want to look at all of your userIds.
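A sketch of that flow in Python, reusing BASE and READ_KEY from the snippet above (the nested "keen" object in each returned event is an assumption worth verifying against your data):
params = {
    "api_key": READ_KEY,
    "event_collection": "survey-completed",
    "timeframe": "this_7_days",
    "property_names": json.dumps(["userId", "keen.timestamp", "surveyScore"]),
}
events = requests.get(BASE + "/extraction", params=params).json()["result"]

latest = {}  # userId -> most recent event for that user
for event in events:
    uid = event["userId"]
    # ISO 8601 UTC timestamps compare correctly as plain strings
    if uid not in latest or event["keen"]["timestamp"] > latest[uid]["keen"]["timestamp"]:
        latest[uid] = event

scores = {uid: e["surveyScore"] for uid, e in latest.items()}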
If you're interested in a given userId and want to use an extraction, you can run an extraction with a filter on userId eq X and set the optional parameter latest to 1. The latest property is an integer containing the number of most recent events to extract. Note: latest orders by the keen.created_at timestamp instead of keen.timestamp (https://keen.io/docs/api/#the-keen-object).
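Again as a sketch, with the same placeholders as above:
params = {
    "api_key": READ_KEY,
    "event_collection": "survey-completed",
    "timeframe": "this_7_days",
    "filters": json.dumps([{"property_name": "userId", "operator": "eq",
                            "property_value": "0123abcd"}]),
    "latest": 1,  # note: ordered by keen.created_at, not keen.timestamp
}
print(requests.get(BASE + "/extraction", params=params).json())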

Related

splunk date time difference

I am new to Splunk. My goal is to optimize the API call, since that particular API method is taking more than 5 minutes to execute.
In Splunk I searched using the context ID and got all the functions and sub-functions called by the main API function for that particular execution. Now I want to figure out which sub-function took the maximum time. In Splunk's left-side field list I see the fields CallStartUtcTime (e.g. "2021-02-12T20:17:42.3308285Z") and CallEndUtcTime (e.g. "2021-02-12T20:18:02.3702937Z"). How can I write a search expression that gives me the difference between these two times? I googled and found that the eval() function can be used, but for me it returns a null value.
Additional Info:
search:
I clicked on "create table view" and checked the start, end, and diff fields in the left-side field list, but all three come back null.
I am not sure what I am doing wrong. I want to find out the time taken by each function.
Splunk cannot compare timestamps in string form. They must be converted to epoch (numeric) form first. Use the strptime() function for that.
...
| eval start = strptime(CallStartUtcTime, "%Y-%m-%dT%H:%M:%S.%7N%Z")
| eval end = strptime(CallEndUtcTime, "%Y-%m-%dT%H:%M:%S.%7N%Z")
| eval diff = end - start
...
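Once diff holds the elapsed seconds, ranking the sub-functions is one stats call away (FunctionName is a hypothetical field name; substitute whatever field identifies a function in your events):
...
| stats max(diff) as max_seconds by FunctionName
| sort - max_seconds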

How to do math operation with columns on grouped rows

I have an Event model with the following attributes (I quoted only the problem-related attributes); this model is filled periodically by an API call to an external service (Google Calendar):
colorid: number # (0-11)
event_start: datetime
event_end: datetime
I need to compute the total duration of events grouped by colorid. I have an Event instance method to calculate a single event's duration:
def event_duration
  ((event_end.to_datetime - event_start.to_datetime) * 24 * 60).to_i
end
Now, I need to do something like this:
event = Event.group(:colorid).sum(event_duration)
But this does not work for me: I get an error that the event_duration column does not exist. My idea is to add one more attribute, "event_duration", to the Event model and compute and update it during Event record creation; that way I would have a column called "event_duration" and might be able to use sum on it. But I am not sure this is a good, clean solution, as I would like the model data to reflect the "raw" data received from the API call, and do all math and statistics on top of the model data.
event_duration is an instance method (not a column name); the error was raised because Event.sum only calculates the sum of an actual column.
In your case, I think it would be easier to use enumerable methods:
duration_by_color_id = {}
grouped_events = Event.all.group_by(&:colorid)
grouped_events.each do |colorid, events|
  duration_by_color_id[colorid] = events.collect(&:event_duration).sum
end
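Or, more compactly, as a sketch assuming Ruby >= 2.4 (for Enumerable#sum with a block and Hash#transform_values); note that Event.all loads every event into memory:
duration_by_color_id = Event.all
  .group_by(&:colorid)
  .transform_values { |events| events.sum(&:event_duration) }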
Sources:
Enumerable's group_by
Enumerable's collect

order_by() method not working in peewee

I am using a SQLite backend with a simple show - season - episode schema:
class Show(BaseModel):
    name = CharField()

class Season(BaseModel):
    show = ForeignKeyField(Show, related_name='seasons')
    season_number = IntegerField()

class Episode(BaseModel):
    season = ForeignKeyField(Season, related_name='episodes')
    episode_number = IntegerField()
and I would need the following query:
seasons = (Season.select(Season, Episode)
           .join(Episode)
           .where(Season.show == SHOW_ID)
           .order_by(Season.season_number.desc(), Episode.episode_number.desc())
           .aggregate_rows())
SHOW_ID being the id of the show for which I want the list of seasons.
But when I iterate over the query with the following code:
for season in seasons:
    for episode in season.episodes:
        print(episode.episode_number)
... I get something which is not ordered at all, and which does not even follow the order I would get without using order_by(), i.e. the insertion order.
I activated the debug logs to see the outgoing query, and the query does contain the ORDER BY clause, and manually applying it returns the proper descending order.
I am new to peewee, and I have seen so many examples making use of a join() combined with an order_by(), but I still cannot figure out what I am doing wrong.
This was due to a bug in the processing of nested collections in the aggregate query result wrapper.
The github issue is: https://github.com/coleifer/peewee/issues/519
The fix has been merged here: https://github.com/coleifer/peewee/commit/ec0e87f1a480695d98bf1f0d7f2e63aed8dfc440
So, to get the fix you'll need to either clone master or wait until the next release, which should be out in the next week or two (2.4.7).
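In the meantime, a stopgap (my suggestion, not from the issue thread) is to sort the nested collection client-side:
from operator import attrgetter

for season in seasons:
    episodes = sorted(season.episodes,
                      key=attrgetter('episode_number'), reverse=True)
    for episode in episodes:
        print(episode.episode_number)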

Rails show different object every day

I want to match my user to a different user in his/her community every day. Currently, I use code like this:
@matched_user = User.near(@user).order("RANDOM()").first
But I want to have a different @matched_user on a daily basis. I haven't been able to find anything on Stack Overflow or in the APIs that has given me insight into how to do it. I feel it should be simpler than having to resort to a rake task with cron. (I'm on Postgres.)
Whenever I find myself hankering for shared 'memory' or transient state, I think to myself "this is what (distributed) caches were invented for".
@matched_user = Rails.cache.fetch(@user.cache_key + '/daily_match', expires_in: 1.day) {
  User.near(@user).order("RANDOM()").first
}
NOTE: While specifying a TTL for a cache entry tells Rails/the cache system to try to keep that value for the given timeframe, there's NO guarantee that it will. In particular, a cache that aggressively tries to reclaim memory may expire an entry well before its desired expires_in time.
For this particular use case it shouldn't be a big deal, but in cases where the business/domain logic demands periodically generated values that are durable, you really have to persist them in your database.
How about using PostgreSQL's SETSEED function? I used the date as the seed so that the seed changes every day but stays consistent within a day:
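# SETSEED expects a value in [-1, 1]; "%y%d%m" yields e.g. "251202" -> 0.251202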
User.connection.execute "SELECT SETSEED(#{Date.today.strftime("%y%d%m").to_i/1000000.0})"
@matched_user = User.near(@user).order("RANDOM()").first
You may want to re-seed with a random value after using this so that future calls to RANDOM() aren't biased:
random = User.connection.execute("SELECT RANDOM()").to_a.first["random"]
# Same code as above:
User.connection.execute "SELECT SETSEED(#{Date.today.strftime("%y%d%m").to_i/1000000.0})"
@matched_user = User.near(@user).order("RANDOM()").first
# Use random value before seed to make new seed:
User.connection.execute "SELECT SETSEED(#{random})"
I have split these steps into separate sections just for readability; you can optimise the query later.
1) Find all user records created before the start of today, so that the count stays frozen for the day.
users_till_today_morning = User.where("created_at < ?", DateTime.now.in_time_zone(Time.zone).beginning_of_day)
2) Pluck all IDs:
user_ids = users_till_today_morning.pluck(:id)
3) Today's day of the month is in the range (1..31) but remains constant throughout the day.
day_today = Time.now.day
4) Select the same ID for the day
todays_user_id = user_ids[day_today % user_ids.count]
#matched_user = User.find(todays_user_id)
So it gives you a pseudo-random user record while maintaining the same record throughout the day.

Magento Bulk update attributes

I am missing the SQL needed to bulk update attributes by SKU/UPC.
Running EE1.10 FYI
I have all the rest of the code working, but I'm not sure about the who/what/why of actually updating our attributes, and haven't been able to find it. My logic is:
Open a CSV and grab all skus and associated attrib into a 2d array
Parse the SKU into an entity_id
Take the entity_id and the attribute and run updates until finished
Take the rest of the day off since it's Friday
Here's my (almost finished) code, I would GREATLY appreciate some help.
/**
* FUNCTION: updateAttrib
*
* REQS: $db_magento
* Session resource
*
* REQS: entity_id
* Product entity value
*
* REQS: $attrib
* Attribute to alter
*
*/
See my response for working production code. Hope this helps someone in the Magento community.
While this may technically work, the code you have written is just about the last way you should do this.
In Magento, you really should be using the models provided by the code and not write database queries on your own.
In your case, if you need to update attributes for 1 or many products, there is a way for you to do that very quickly (and pretty safely).
If you look in: /app/code/core/Mage/Adminhtml/controllers/Catalog/Product/Action/AttributeController.php you will find that this controller is dedicated to updating multiple products quickly.
If you look in the saveAction() function you will find the following line of code:
Mage::getSingleton('catalog/product_action')
    ->updateAttributes($this->_getHelper()->getProductIds(), $attributesData, $storeId);
This code is responsible for updating all the product IDs you want, only the changed attributes for any single store at a time.
The first parameter is basically an array of Product IDs. If you only want to update a single product, just put it in an array.
The second parameter is an array that contains the attributes you want to update for the given products. For example if you wanted to update price to $10 and weight to 5, you would pass the following array:
array('price' => 10.00, 'weight' => 5)
Finally, the third parameter is the store ID these updates should apply to. Most likely this number will be either 1 or 0.
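Putting those three parameters together, a minimal call might look like this (the product ID, attribute values, and store ID are placeholders):
Mage::getSingleton('catalog/product_action')->updateAttributes(
    array(123),                             // product IDs to update (placeholder)
    array('price' => 10.00, 'weight' => 5), // attribute code => new value
    0                                       // store ID (0 = default scope)
);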
I would play around with this function call and use this instead of writing and maintaining your own database queries.
The general update query will look like:
UPDATE
  catalog_product_entity_[backend_type] cpex
SET
  cpex.value = ?
WHERE cpex.attribute_id = ?
  AND cpex.entity_id = ?
In order to find the [backend_type] associated with the attribute:
SELECT
  backend_type
FROM
  eav_attribute
WHERE entity_type_id =
  (SELECT
    entity_type_id
  FROM
    eav_entity_type
  WHERE entity_type_code = 'catalog_product')
AND attribute_id = ?
You can get more info from the following blog article:
http://www.blog.magepsycho.com/magento-eav-structure-role-of-eav_attributes-backend_type-field/
Hope this helps you.