I've recently started working with ksql and wanted to check if someone can help me with a query design. The problem statement is that I have a video conferencing app where a broadcaster can start and pause the stream multiple times. I want to get the total played time and the total paused time for that stream. I have clickstream data consisting of start and pause timestamps. How should I go about this so that I can generate an optimized view?
Any help is very deeply appreciated :)
Thank You
Grouping events
The first problem you'll need to solve is how to group start/stop events together.
Likely, you'll want to group them by some kind of USER_ID or other attribute that uniquely identifies the broadcaster who is starting/stopping the stream.
Likely, you'll also want to group by some kind of STREAM_ID or other attribute that uniquely identifies the stream being played.
This may be sufficient if you only want the total play time per broadcaster, per video. However, you may also want to take time into account. For example, if I watch a video today, and then watch it again tomorrow, is that two viewing sessions, with two independent view time totals, or do you not care?
One way of grouping events in time is using session windows. Before you sessionize the data you'd need to define the parameters that define your session. Here's a good example of using session windows in ksqlDB.
Another way of grouping events in time is using tumbling windows. Here's a good example of using tumbling windows.
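To make the session-window idea concrete, here is a minimal Python sketch of grouping events by key and inactivity gap. It is not ksqlDB code and only illustrates the concept; the function name sessionize, the (user, stream) key shape and the gap_ms parameter are assumptions chosen for illustration.

```python
from collections import defaultdict


def sessionize(events, gap_ms):
    """Group (key, timestamp_ms) events into sessions per key.

    A new session starts whenever the time since the previous event
    for the same key exceeds gap_ms (the inactivity gap).
    """
    by_key = defaultdict(list)
    for key, ts in events:
        by_key[key].append(ts)

    sessions = {}
    for key, stamps in by_key.items():
        stamps.sort()
        current = [stamps[0]]
        grouped = []
        for ts in stamps[1:]:
            if ts - current[-1] > gap_ms:
                # Gap exceeded: close the current session and open a new one.
                grouped.append(current)
                current = [ts]
            else:
                current.append(ts)
        grouped.append(current)
        sessions[key] = grouped
    return sessions
```

A tumbling window would instead bucket each timestamp by ts // window_size_ms, which gives fixed, non-overlapping intervals regardless of activity.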
Calculating play time
Once you've grouped your events, you'll likely need to calculate the play time. For example, if I start playing at time 5 and stop playing at time 8, then the amount of time I was watching the video is 8 - 5 = 3.
This requires capturing the play event, waiting for the stop event, and then outputting the difference in time, and doing so in a fault-tolerant way.
At the time of writing, this would require a custom UDAF (user-defined aggregate function).
A custom UDAF could capture the start event, store it for future reference, and output a '0' for the play time, and then when it sees the corresponding stop event it can remove the start event from its state, calculate the play time and return it.
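The pairing logic described above can be sketched in plain Python. This is not a ksqlDB UDAF, only an illustration of the state such a function would keep; the event kinds 'start' and 'pause' and the function name are assumptions, and the events are assumed to arrive in timestamp order.

```python
def play_and_pause_totals(events):
    """Return (played, paused) totals for an ordered list of events.

    events: iterable of (kind, timestamp), kind being 'start' or 'pause'.
    A repeated 'start' or a 'pause' with no open 'start' is ignored.
    """
    played = 0
    paused = 0
    last_start = None  # timestamp of the currently open play interval
    last_pause = None  # timestamp of the currently open pause interval
    for kind, ts in events:
        if kind == "start":
            if last_pause is not None:
                # Resuming: close the open pause interval.
                paused += ts - last_pause
                last_pause = None
            if last_start is None:
                last_start = ts
        elif kind == "pause":
            if last_start is not None:
                # Pausing: close the open play interval.
                played += ts - last_start
                last_start = None
                last_pause = ts
        else:
            raise ValueError(f"unknown event kind: {kind!r}")
    return played, paused
```

With events start at 5, pause at 8, start at 10 and pause at 15, the played total is 3 + 5 = 8 and the paused total is 10 - 8 = 2.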
Here's a good example of writing a custom UDF in ksqlDB, though you require a custom UDAF, which are covered here.
There is currently a PR open with an enhancement to the LATEST_BY_OFFSET function that may well serve your purpose. This enhances the function to allow it to capture the last N values, rather than just the last value. Likely, this will be released in ksqlDB v0.13, and you can always pull the code and compile it locally, if you have any development experience. If it doesn't serve your purpose, then you may be able to use it as the starting point for developing your own.
Of course, these solutions require your stream of source events to be correctly ordered, so that stop events never come before their associated play events.
Aggregating
Once you've calculated the play time between a pair of start/stop events, you'll then need to aggregate them. Here's a good example of how to aggregate in ksqlDB.
Is there a way to detect a user pausing a run/activity within the strava API?
With Get Activity Streams (getActivityStreams) you can obtain different StreamSets from your activity. To detect a pause, I think you can analyze the CadenceStream or MovingStream.
Pauses are not available in the Strava API and can not be extracted consistently through algorithmic processing of the available fields. Moreover, the data contained in the API's streams collection can not be processed in a way which will arrive at the summary distance or time of the run.
The MovingStream contains a bit field which does not flag pauses, but instead (presumably) flags points where the athlete stopped moving. Although, that said, this field can not be used to arrive at the Moving Time by summing up the time values where this flag is true.
I’m developing a small game where the player owns droids used to perform automated actions. The simplest example is ordering a droid to go to a specific position: the user gives it a position and the droid goes there. I already use Azure Functions a lot and I’d like to use them to make the droids move.
Off the top of my head, I thought about making one function that would trigger every minute, fetch all the droids that need to move, and then move them.
The issue with this approach is that if the game is popular, there could be hundreds of droids, and I have to ensure that the function's execution time stays below one minute.
I thought about retrieving all the droids that need to move and then, for each of them, calling an Azure Function via its URL so that it executes for that particular droid. In my head, this would parallelize the execution a bit, but I’m not sure I’m correct.
I also have to think about whether to use SQL transactions so that I don't create deadlocks.
The final question would be: « how do I handle recurring processing of a potentially large amount of data and ensure that it stays below one minute? »
Thanks for your advice
Typically, you handle such scenarios with queues. Each order becomes a queue message, and an Azure Function is triggered by it and processes the order. It can and will scale based on the number of messages in the queue.
If your logic still requires timer-based processing, the timer function should be as lean as possible, e.g. it should only send messages to a queue, and the queue-triggered function does the real work.
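The lean-timer-plus-queue pattern can be illustrated with an in-process Python sketch. It uses the standard library's queue and threads as a stand-in for an Azure queue and queue-triggered functions, so it shows only the shape of the design, not the Azure API; move_droid and run_workers are hypothetical names.

```python
import queue
import threading


def move_droid(droid_id, target):
    # Hypothetical placeholder for the real per-droid game logic.
    return (droid_id, target)


def run_workers(orders, worker_count=4):
    """Enqueue every order, then let a pool of workers drain the queue."""
    work = queue.Queue()
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            order = work.get()
            if order is None:  # sentinel: no more work for this worker
                break
            outcome = move_droid(*order)
            with lock:
                results.append(outcome)

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for t in threads:
        t.start()
    # The "timer" step stays lean: it only enqueues one message per droid.
    for order in orders:
        work.put(order)
    for _ in threads:
        work.put(None)
    for t in threads:
        t.join()
    return results
```

Because each message is processed independently, the per-minute timer no longer has to finish all the moves itself; throughput is governed by how many consumers drain the queue.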
Tamper data
There is a terrible thing called Tamper Data. It intercepts all the data POSTed from Flash to PHP and lets the user change the values.
Imagine a Flash game (written in ActionScript 3) that tracks score points and time. After a match is completed, the score and time variables are sent to PHP and inserted into the database.
But the user can easily change the values with Tamper Data after the match is completed, so the changed values get inserted into the database.
My idea, which probably won't work
My idea was to update the database on every change. I mean, if the player gets +10 score points, I write it to the database immediately. But what about time? Would I need to update my table every millisecond? Is that a protection at all? If the user can change POST data, he can change it every time, including the last time when the game is completed.
So how can I guard against third-party software like Tamper Data?
Tokens. I've read an article about tokens that talks about creating a random string as a token and comparing it with the database, but it isn't detailed and I have no idea how to implement it. Is that a good idea? If yes, could someone explain how to implement it in practice?
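In my opinion, a better way is to send both the parameter and the value in encrypted form. For example, score=12 would be sent as c2NvcmU9MTI= if it were only base64-encoded (base64 by itself is an encoding, not encryption, so the functions below also encrypt the data):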
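// Note: the mcrypt extension used here was removed from PHP core in PHP 7.2.
// SALTKEY is a constant that must be defined elsewhere.
function encrypt($str)
{
    // Serialize, encrypt (Rijndael-256, CBC), base64-encode, then make the result URL-safe.
    $s = strtr(base64_encode(mcrypt_encrypt(MCRYPT_RIJNDAEL_256, md5(SALTKEY), serialize($str), MCRYPT_MODE_CBC, md5(md5(SALTKEY)))), '+/=', '-_,');
    return $s;
}

function decrypt($str)
{
    // Reverse the steps of encrypt(); rtrim() strips the zero padding added by mcrypt.
    $s = unserialize(rtrim(mcrypt_decrypt(MCRYPT_RIJNDAEL_256, md5(SALTKEY), base64_decode(strtr($str, '-_,', '+/=')), MCRYPT_MODE_CBC, md5(md5(SALTKEY))), "\0"));
    return $s;
}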
In general, there is no way to protect the content generated in Flash and sent to server.
Even if you encrypt the data with a secret key, both the key and the encryption algorithm are contained in the swf file and can be decompiled. It is a bit harder than simply faking the data, so it is a somewhat usable solution, but it will not always help.
To have full security, you need to run the entire game simulation on the server. For example, if the player jumped and caught a coin, Flash does not send "score +10" to the server. Instead, it sends the player's coordinates and speed, and the server does the check: where is the coin, where is the player, what is the player's speed, and can the player get the coin or not.
If you cannot run the full simulation on the server, you can do a partial check by sending data to server at some intervals.
First, never send a "final" score or any other score. It is very easy to fake. Instead, send an event every time the player does something that changes his score.
For example, every time the player catches a coin, you send this event to the server. You may not track the player's coordinates or the coin coordinates, but you know that the level contains only 10 coins, so the player cannot catch more than 10 coins anyway. Also, the player can't catch coins too fast, because you know the minimum distance between coins and the maximum player speed.
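The partial check described above could look roughly like this on the server. The sketch is in Python rather than PHP and is only an illustration; the class name CoinValidator and its parameters are assumptions, and min_interval_s stands for the minimum coin distance divided by the maximum player speed.

```python
class CoinValidator:
    """Server-side plausibility check for 'coin caught' events."""

    def __init__(self, max_coins, min_interval_s):
        self.max_coins = max_coins            # coins that exist in the level
        self.min_interval_s = min_interval_s  # fastest possible time between coins
        self.coins = 0
        self.last_time = None

    def accept(self, server_time_s):
        """Return True if the event is plausible; uses the server's own clock."""
        if self.coins >= self.max_coins:
            return False  # more coins than the level contains
        if (self.last_time is not None
                and server_time_s - self.last_time < self.min_interval_s):
            return False  # coins caught faster than the player can move
        self.coins += 1
        self.last_time = server_time_s
        return True
```

Rejected events are simply not counted here; what to do with a player who triggers many rejections is a separate policy decision.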
You should not write the data to database each time you receive it. Instead you need to keep each player's data in memory and change it there. You can use a noSQL database for that, for example Redis.
First, cheaters will always cheat. There's really no easy solution (or difficult one) to completely prevent it. There are lots of articles on the great lengths developers have gone to discourage cheating, yet it is still rampant in nearly every game with any popularity.
That said, here are a few suggestions to hopefully discourage cheating:
Encrypt your data. This is not unbeatable, but will discourage many lazy hackers since they can't just tamper with plain http traffic, they first have to find your encryption keys. Check out as3corelib for AS3 encryption.
Obfuscate your SWFs. There are a few tools out there to do this for you. Again, this isn't unbeatable, but it is an easy way to make it harder for cheaters to find your encryption keys.
Move all your timing logic to the server. Instead of your client telling the server about time, tell the server about actions like "GAME_STARTED" and "SCORED_POINTS". The server then tracks the user's time and calculates the final score. The important thing here is that the client does not tell the server anything related to time, but merely the action taken and the server uses its own time.
If you can establish any rules about maximum possible performance (for example 10 points per second) you can detect some types of cheating on the server. For example, if you receive SCORED_POINTS=100 but the maximum is 10, you have a cheater. Or, if you receive SCORED_POINTS=10, then SCORED_POINTS=10 a few milliseconds later, and again a few milliseconds later, you probably have a cheater. Be careful with this, and know that it's a back and forth battle. Cheaters will always come up with clever ways to get around your detection logic, and you don't want your detection logic to be so strict that you accidentally reject an honest player (perhaps a really skilled player who is out-performing what you initially thought possible).
When you detect a cheater, "honey pot" them. Don't tell them they are cheating, as this will only encourage them to find ways to avoid detection.
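As a rough illustration of the server-side timing and rate-limit suggestions above, here is a Python sketch. The action names GAME_STARTED and SCORED_POINTS come from the text; the GameSession class, its parameters and the one-second burst allowance are assumptions made for the example.

```python
import time


class GameSession:
    """Tracks time and score on the server; the client only reports actions."""

    def __init__(self, max_points_per_second, clock=time.monotonic):
        self._clock = clock  # server-side clock, never supplied by the client
        self._max_rate = max_points_per_second
        self._started_at = None
        self.score = 0
        self.flagged = False  # set silently; the client is not told

    def handle(self, action, points=0):
        now = self._clock()
        if action == "GAME_STARTED":
            self._started_at = now
            self.score = 0
        elif action == "SCORED_POINTS" and self._started_at is not None:
            self.score += points
            elapsed = now - self._started_at
            # Allow one extra second of burst so honest players are not rejected.
            if self.score > self._max_rate * (elapsed + 1):
                self.flagged = True

    def elapsed_seconds(self):
        """Play time as measured by the server clock."""
        if self._started_at is None:
            return 0.0
        return self._clock() - self._started_at
```

The flag is only recorded, in line with the "honey pot" advice: the server can keep accepting requests while excluding flagged sessions from leaderboards.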
Brief
I am working with an auto-renewable subscription-based app. When the app receives the latest receipt from Apple, it stores the expires_date_ms key in NSUserDefaults. Thirty days after that date, the app checks with Apple to see if the subscription is still active. The app can be considered an offline app, but it must connect to the internet once every 30 days in order to check the subscription status. This time comparison will be used to tell the user he/she must connect.
Problem
I am using the code below to compare the current time with the expires_date_ms:
NSTimeInterval expDateMS = [[productInfo objectForKey:@"expires_date_ms"] doubleValue];
NSTimeInterval currentDateMS = ([[NSDate date] timeIntervalSince1970] * 1000);
if (currentDateMS > expDateMS)
    subExpired = YES;
This is fine and works well, but from what I can tell there's a loophole that can be exploited: if the user sets the device's clock back an hour/month/decade, the time comparison will become unreliable because [NSDate date] uses the device's current time (please correct me if I'm wrong).
Question
Is there any way of retrieving a device-independent time in milliseconds? One that can be accurately and reliably measured with no regards to the device clock?
While Kevin and H2CO3 are completely correct, there are other solutions for the purposes of checking a subscription (which I would hope does not need millisecond accuracy....)
First, watch UIApplicationSignificantTimeChangeNotification so that you get notifications when the time changes suddenly. This will even be delivered to you if you were suspended (though I don't believe you will receive it if you were terminated). This gets called when there is a carrier time update, and I believe it is called when there is a manual time update (verify this). It is also called at local midnight and at DST changes. The point is that it's called pretty often when the time suddenly changes.
Keep track of what time it was when you go into the background. Keep track of what time it is when you come back into the foreground. If time moves radically backwards (more than a day or two), kindly suggest that you would like access to the network to check things. Whenever you check-in with your server, it should tell you what time it thinks it is. You can use that to synchronize the system.
You can similarly keep track of your actual runtime. If it gets wildly out of sync with apparent runtime, then again, request access to the network to sync things up.
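The two checks described above reduce to simple comparisons. The following Python sketch (not iOS code) shows the logic under stated assumptions: the function names and the one-day tolerance are chosen for illustration, and the timestamps are whatever values the app persisted when it went to the background and read when it returned.

```python
ONE_DAY_S = 24 * 60 * 60


def clock_moved_back(last_seen_wall_s, now_wall_s, tolerance_s=ONE_DAY_S):
    """True if the wall clock is now more than tolerance_s earlier than
    the last wall-clock time the app recorded."""
    return last_seen_wall_s - now_wall_s > tolerance_s


def runtime_out_of_sync(wall_elapsed_s, monotonic_elapsed_s, tolerance_s=ONE_DAY_S):
    """True if elapsed time by the wall clock and elapsed time by a
    monotonic (uptime-based) clock disagree by more than tolerance_s."""
    return abs(wall_elapsed_s - monotonic_elapsed_s) > tolerance_s
```

When either check returns True, the app would ask for network access and resynchronize against the server's time rather than accuse the user of anything.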
I'm certain that attackers would be able to sneak 35 days or whatever out of this system rather than 30, but anyone willing to work that hard will just crack your software and take the check out entirely. The focus here is the uncommitted attacker who is just messing with their clock. And that you can catch pretty well.
You should test this carefully, and be very hesitant to accuse the user of anything. Just connecting to your server should always be enough to get a legitimate user working again.
You need to connect to and retrieve information from a reliable, official time server and use that time data in your app. For example, here's a world time server with an easy-to-use API.
Here are three options I can think of:
clock_gettime(CLOCK_MONOTONIC) gets the current system uptime. This is relatively unreliable because if the user reboots, this is reset. You could save the last value used and at launch use the last saved value as an offset, but the problem with this is that the time that the device was shut off for won't be calculated.
mach_absolute_time() counts the number of CPU ticks since the last reboot. It can be fetched easily through CACurrentMediaTime. Note that this can be reset simply by rebooting the device, so if detecting a changed clock is very important, I'm not sure you would want to go this way.
Network Time Protocol (NTP) is a networking protocol for synchronizing the clocks of computer systems. In practice, using NTP amounts to querying a time server. An iOS library for NTP can be found here.
So the first two methods do not require connectivity, while the third does. However, the third method is the only foolproof one.
There is no such thing as a non-mutable device clock that persists across reboots. The only way to get a trustworthy time is to contact a remote server that you trust and ask what its time is.
Using ThreadPoolRuntime, I could get a throughput attribute that means "The mean number of requests completed per second". It's not what I want. I want to get a realtime figure, not the mean.
Requests per second is by its nature an average, so I'm not too sure what you mean by a realtime figure. Do you want the number of requests completed in the last second?
The ApplicationRuntimes/[appname]/WorkManagerRuntimes/default/CompletedRequests gives the total number of requests completed for one application, you can use this to calculate an RPS figure over whatever timeframe you want.
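Turning two samples of a cumulative counter such as CompletedRequests into a rate is a small calculation. This Python sketch assumes you already have some way of reading the counter at two points in time; the function name and the handling of a counter reset are assumptions made for illustration.

```python
def requests_per_second(prev_count, prev_time_s, curr_count, curr_time_s):
    """Rate of completed requests between two samples of a cumulative counter."""
    interval = curr_time_s - prev_time_s
    if interval <= 0:
        raise ValueError("samples must be taken at increasing times")
    delta = curr_count - prev_count
    if delta < 0:
        # Assumption: the counter restarted from zero (e.g. a server restart),
        # so only the requests counted since the reset are known.
        delta = curr_count
    return delta / interval
```

Sampling every second gives a near-realtime figure, while a longer interval gives a smoother one; either way the result is still an average over the chosen interval.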
Unless this is a custom work manager's thread pool, the number you're going to get back isn't going to be terribly meaningful. And even in the case of a custom thread pool assigned to your particular application component (EJB, WAR file, etc) then the number still isn't likely to mean what you're looking for.
The thread pool is used to perform all work for that component (or, in the case of the default thread pool, all work for the server, both internal and client-driven). This means that requests of wildly different 'cost' in terms of CPU and execution time go through the same pool.
What is the problem that you're trying to solve? Is it an understanding of how many requests per second are occurring for particular application components? You might want to look at WLDF as an alternative source for this kind of data, although in either case you'll need to post-process information to get something useful.