Configuring m3u8 file to run with ts segments that don't change the names - live-streaming

I don't know much about this or how to configure it.
I have 4 segments whose names never change (segment1.ts, segment2.ts, segment3.ts, segment4.ts), but their contents are updated every 5 seconds (each keeps the same name).
I think the m3u8 is configured correctly apart from the MEDIA-SEQUENCE, and I don't know how I should set it. Any help?
#EXT-X-TARGETDURATION:5
#EXT-X-VERSION:3
#EXT-X-MEDIA-SEQUENCE:28578
#EXTINF:5.000,
segment1.ts
#EXTINF:5.000,
segment2.ts
#EXTINF:5.000,
segment3.ts
#EXTINF:5.000,
segment4.ts

You need to refresh the playlist each time a new segment is available. The client (player) will reload the playlist periodically to check for new segments.
If you only keep a limited number of segments, you will use a fixed-size sliding window: the oldest segment is removed once a new one is added. You generally want to maintain a constant playlist duration rather than a constant segment count, since segments can have different durations.
EXT-X-MEDIA-SEQUENCE must be incremented by one for each segment removed from the playlist, not when adding as mentioned in the comments.
You cannot reuse only 4 segments for a 4 segment sliding window. Each time you remove a segment from the playlist it still needs to be available on the server for a time equal to the duration of the segment plus the maximum playlist duration published by the server so far.
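As a rough illustration of the sliding window described above, here is a minimal Python sketch. The class name `LivePlaylist`, its methods, and the `segmentN.ts` naming scheme are hypothetical, and for simplicity the window is capped by segment count rather than by total duration. Each new segment gets a unique name, and the media sequence goes up by one only when a segment is dropped from the playlist.

```python
from collections import deque


class LivePlaylist:
    """Hypothetical sliding-window playlist builder (illustration only)."""

    def __init__(self, window=4, target_duration=5):
        self.window = window
        self.target_duration = target_duration
        self.media_sequence = 0  # sequence number of the first listed segment
        self.next_index = 0      # used to give every segment a unique name
        self.segments = deque()  # (uri, duration) pairs currently listed

    def add_segment(self, duration):
        """Append a new, uniquely named segment and trim the window."""
        uri = "segment{0}.ts".format(self.next_index)
        self.next_index += 1
        self.segments.append((uri, duration))
        while len(self.segments) > self.window:
            self.segments.popleft()
            # Incremented once per segment REMOVED, not per segment added.
            self.media_sequence += 1
        return uri

    def render(self):
        """Return the playlist text for the current window."""
        lines = [
            "#EXTM3U",
            "#EXT-X-VERSION:3",
            "#EXT-X-TARGETDURATION:{0}".format(self.target_duration),
            "#EXT-X-MEDIA-SEQUENCE:{0}".format(self.media_sequence),
        ]
        for uri, duration in self.segments:
            lines.append("#EXTINF:{0:.3f},".format(duration))
            lines.append(uri)
        return "\n".join(lines) + "\n"


playlist = LivePlaylist(window=4, target_duration=5)
for _ in range(6):
    playlist.add_segment(5.0)
print(playlist.render())
```

The sketch only tracks what is listed in the playlist. Files for segments that have dropped out of the window would still have to stay on the server for the period described above before being deleted.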

report scheduler system design using database as master

Problem
we have ~50k scheduled financial reports that we periodically deliver to clients via email
reports have their own delivery frequency (date&time format - as configured by clients)
weekly
daily
hourly
weekdays only
etc.
Current architecture
we have a table called report_metadata that holds report information
report_id
report_name
report_type
report_details
next_run_time
last_run_time
etc...
every week, all 6 instances of our scheduler service poll the report_metadata database, extract metadata for all reports that are to be delivered in the following week, and put them in an in-memory timed queue.
Only in the master/leader instance (which is one of the 6 instances):
data in the timed-queue is popped at the appropriate time
processed
a few API calls are made to get a fully-complete and current/up-to-date report
and the report is emailed to clients
the other 5 instances do nothing - they simply exist for redundancy
Proposed architecture
Numbers:
db can handle up to 1000 concurrent connections - which is good enough
total existing report number (~50k) is unlikely to get much larger in the near/distant future
Solution:
instead of polling the report_metadata db every week and storing data in a timed-queue in-memory, all 6 instances will poll the report_metadata db every 60 seconds (with a 10 s offset for each instance)
on average the scheduler will attempt to pick up work every 10 seconds
data for any single report whose next_run_time is in the past is extracted, the table row is locked, and the report is processed/delivered to clients by that specific instance
after the report is successfully processed, table row is unlocked and the next_run_time, last_run_time, etc for the report is updated
In general, the database serves as the master, individual instances of the process can work independently and the database ensures they do not overlap.
It would help if you could let me know if the proposed architecture is:
a good/correct solution
which table columns can/should be indexed
any other considerations
I have worked on a different kind of scheduler, for a program that reported analyses at specific moments of the month or week. What I did was tie the reports to so-called business-cycle moments, such as "start of a new week", "start of the month", or "start/end of a D/W/M/Q/Y". So I standardised the moments at which reports are sent and added the report IDs to a table that carries the details of each report. You can then add a report to a cycle or remove it when needed, for example by tagging it with EOD (end of day), EOM (end of month), SOW (start of week), etc.
So you could index the moments at which clients want to receive their reports and build on that. I hope this comment helps with your challenge.
It seems good to simply query that metadata table by all 6 instances to check which is the next report to process as you are suggesting.
It seems odd, though, to have a staggered approach with a check once every 60 seconds, offset by 10 seconds for each of your servers. You have 6 servers now, but that may change. Also, I don't understand the "locking" you are suggesting. Why not simply set a flag on the row, such as [State] = "processing", so that the next scheduler knows to skip that row and move on to the next available one? Once a run is processed, you can simply update a [Date_last_processed] column, or maybe something like [last_cycle_complete] = 'YES'.
Alternatively you could have one server-process to go through the table, and for each available row, sends it off to one of the instances, in a round-robin fashion (or keep track of who is busy and who isn't).
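The state-flag idea above could look roughly like the following sketch. It uses Python's built-in sqlite3 purely as a stand-in for the real database, and the `state` column, the `idx_due` index, and the function names are all assumptions made for illustration. The key point is the conditional UPDATE: a worker has claimed a report only if that statement changed exactly one row, so two instances cannot both pick up the same report.

```python
import sqlite3


def make_db():
    """Create a toy report_metadata table (SQLite stands in for the real DB)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE report_metadata ("
        " report_id INTEGER PRIMARY KEY,"
        " next_run_time INTEGER NOT NULL,"
        " last_run_time INTEGER,"
        " state TEXT NOT NULL DEFAULT 'idle')"
    )
    # Assumed index: the poll filters on state and next_run_time.
    conn.execute("CREATE INDEX idx_due ON report_metadata (state, next_run_time)")
    return conn


def claim_due_report(conn, now):
    """Try to claim one due report; return its id, or None if nothing was claimed."""
    with conn:  # commits on success, rolls back on error
        row = conn.execute(
            "SELECT report_id FROM report_metadata"
            " WHERE state = 'idle' AND next_run_time <= ?"
            " ORDER BY next_run_time LIMIT 1",
            (now,),
        ).fetchone()
        if row is None:
            return None
        claimed = conn.execute(
            "UPDATE report_metadata SET state = 'processing'"
            " WHERE report_id = ? AND state = 'idle'",
            (row[0],),
        )
        # rowcount is 0 if another worker flipped the flag first.
        return row[0] if claimed.rowcount == 1 else None


def finish_report(conn, report_id, now, next_run_time):
    """Release the report and schedule its next run."""
    with conn:
        conn.execute(
            "UPDATE report_metadata"
            " SET state = 'idle', last_run_time = ?, next_run_time = ?"
            " WHERE report_id = ?",
            (now, next_run_time, report_id),
        )
```

One thing this sketch does not handle is a worker that crashes while a row is marked 'processing'. Recording when the row was claimed, and treating very old claims as stale, would be one possible way to recover from that.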

Randomly select DynamoDB entry

I have a DynamoDB table called URLArray that contains a list of URLs (myURL) and a unique video name (myKey).
I need to do two things:
When a user clicks the next video button, a random entry needs to be selected from this URLArray. There could be potentially tens of thousands of rows.
The user is logged into the app. Every time they finish watching a video, the video's unique name is recorded. So when the user has seen a video, it's added to a list in a table called Users, under the user's info row.
So the random entry that gets selected when the user clicks the next video button in point 1 has to be compared to the list of videos they've already seen, to make sure that it doesn't randomly appear again for that particular user.
I do something woefully inefficient so far, that works, but it's not great:
By the way, I'm using AppSync + GraphQL to interact with the DynamoDB table. I first get a local copy of the URLArray:
//Gets a list of the Key/URL pairs in the UrlArrays table in GraphQL ****IN CONSTRUCTOR, so we have this URLArray data when componentDidMount()****
listUrlArrays = async () => {
  try {
    const apiData = await API.graphql(graphqlOperation(ListUrlArrays)); //GraphQL query
    const items = apiData.data.listURLArrays.items;
    //URLData[] is available in the entire class through this.state
    this.setState({URLData: items, urlArrayLength: items.length}); //also stores the length of URLArray (i.e. how many videos are in the database)
  } catch (err) {
    console.log('error fetching URLArray', err);
  }
}
As an overview, when user clicks for the next video:
//When clicking next video
async nextVideo(){
  await this.logVideosSeen(); //add myKey to the list of videos in *Users* table the logged in user has now seen
  await this.getURL(); //get the NEXT upcoming video's details, for Video Player to play and make sure it's not been seen before
}
//This will update the 'listOfVideosSeen[]' in Users table with videos unique myKey, the logged in user has seen
logVideosSeen = async () => {
  .......
}
async getURL() {
  var dbIndex = this.getUniqueRandomNumber(this.state.urlArrayLength); //Choose a number between 0 and N number of videos in URLArray
  //the hasVideoBeenSeen() basically gets the list of videos a user has already seen from `Users` table with the GraphQL getusers command, and creates a local copy of this list (can get big). I use javascripts indexOf() to check whether myKey already exists in the list
  while(await this.hasVideoBeenSeen(this.state.URLData[dbIndex].myKey)) //while true i.e. user has seen that video before
  {
    dbIndex = this.getUniqueRandomNumber(this.state.urlArrayLength); //get another random number to fetch a new myKey
  }
  //If false, we'll exit the loop and know we've got a not seen before myKey, proceed to set to play...
  if(dbIndex != null){
    this.setState({ playURL: this.state.URLData[dbIndex].vidURL }); //Retrieve the URL from the local URLArray that we're going to play (i.e. the next video to come)
  }
}
I can share a little more code if needed, but essentially I wanted to know how to:
Let a Lambda function select a random number based on the current URLArray size (i may need to keep a local copy of URLArray anyway). But i think point 2 here is where it's really inefficient..
Let a Lambda function check (the while loop) against the Users table whether myKey has already been seen. Mainly to shift this computational burden to the cloud instead of the local device the app runs on.
AFTER A THINK................
Thanks for the suggestion Seth. I have been thinking about it for some time, and while the randomness requirement still holds true, I think there is some truth in what you’ve suggested. The reason I need randomness, is so that 2 users sat side by side for example, can’t predict which video is coming next. It shouldn’t be a predictable sequence of videos. I'm not sure I can use Scan function with AWS Amplify/GraphQL. So remember there’s 2 things going on here: (1) a video upload, recording it in the URLArray sensibly for future reference. (2) users viewing a previously unseen random video and then moving onto another unseen random video
*(1)
I like your idea of using a number to index the URLArray, and it’s helped to make life a bit easier. So the first URL being at index 0, the next at 1 etc…
My thinking here (to avoid me doing a ListUrlArrays() and bringing the WHOLE array locally to the phone), is to create a GSI called VideoNumber for the URLArray table. This will be the unique VideoNumber column with a number 0-N. So imagine the diagram above having another column called VideoNumber. Row 1 having VideoNumber set to 0, Row 2 having VideoNumber set to 1 etc… THEN all I would need to do, is locally on the device, generate a random number between 0-N, call a getURLArrayIdbyVideoNumber() query specific for that GSI, with the number that we just generated, and it’ll unlock the information I need from the row. Voila! I think that shifts most of that heavy burden away now.
Question: Before each video is uploaded, how do I easily get the current total number of rows N in the table (or row count)? I would then increment it by one.
The other thing I can do is save this current count number in another DynamoDB table that I use for persisted data, read the number from there before upload, and write an N+1 after upload to increment it (2 DynamoDB operations per upload). It’s not ideal.
*(2)
When a user has finished watching a video, I can log in a list (under the user's information in DynamoDB) which videos they've already seen. So for example this could now be a seen list: [3,12,73,108,57] for the 5 videos they've seen so far. When the user clicks nextVideo() we'll generate a random newNumber, and straight away compare that with the numbers in the seen list. I use seenlist.indexOf(newNumber), and it will either go again or stop if the newNumber doesn't exist in the list. THEN I can go through the GSI query, and retrieve the relevant information to display the video from URLArray.
I think that this indexOf() is the biggest computational burden on the device, and obviously gets a little slower as the seenList increases. But it should be quicker with pure integer numbers than with an alphanumeric myKey as I was using before. Any other suggestion would be welcome :)
I’ve yet to try it, but it was just an idea, as I need to keep the random element. But first, do you know how I can easily find the number of rows or table count of URLArray?
I think you'll have an easier time coming up with a solution to this problem if you drop the randomness requirement. It sounds like the more important requirement is presenting the user with a video they haven't seen before.
If that's correct, it sounds like your access pattern could be stated as
Fetch previously unseen video for user
which is an easier problem to solve.
Unlike SQL databases, there are often many ways to implement a given access pattern in DynamoDB. My answer here is just one way.
Imagine your URLArray table as a giant array. The first URL is at index 0, the second URL is at index 1, the third URL is at index 2, and so on. Each user of your application would start by watching the video at URL index 0, then URL index 1, etc. This would ensure the user never sees the same video twice. You would not need to store a list of all the videos they've seen. Instead, you could store the index of the last video they saw.
Your application could grab the first n videos from the table to present to your users. Once that list was exhausted, it could go grab the next n videos. And so on...
What I've described here is essentially how pagination is implemented in DynamoDB. To bring this abstraction back to the world of DynamoDB, your algorithm could look something like this:
Scan the URLArray table for the first "page" of URLs (a scan operation with no filter criteria)
Along with the results, DynamoDB will respond with a LastEvaluatedKey, which will allow you to retrieve the next page of results starting from this position
Present your user with each video you pulled back from the scan operation, making sure to record the id (the Primary Key) of the last video they saw.
When you exhaust the URLs from step 1, execute another scan operation with the ExclusiveStartKey set to the LastEvaluatedKey returned from step 2.
When users return to your application, query for the next page from the URLArray table with ExclusiveStartKey set to the id of the last video they viewed.
This effectively uses the scan operation to search through your URLArray table one page at a time. Your application would effectively be searching the table from top to bottom, keeping track of where each user is at any given time. When a user revisits your application, just start where they left off.
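A rough Python sketch of the paging steps above follows. `fetch_page` is a hypothetical helper written against any object whose `scan` method accepts `Limit` and `ExclusiveStartKey` and returns a dict with `Items` and, when more data remains, `LastEvaluatedKey` (the same shape a boto3 Table resource returns). `FakeTable` is only an in-memory stand-in so the sketch can run without AWS, and it assumes `myKey` is the table's primary key.

```python
def fetch_page(table, page_size, start_key=None):
    """Fetch one page of items and the key to resume from (None when done)."""
    kwargs = {"Limit": page_size}
    if start_key is not None:
        kwargs["ExclusiveStartKey"] = start_key
    response = table.scan(**kwargs)
    return response.get("Items", []), response.get("LastEvaluatedKey")


class FakeTable:
    """In-memory stand-in that mimics the shape of a scan response."""

    def __init__(self, items):
        self.items = items

    def scan(self, Limit, ExclusiveStartKey=None):
        start = 0
        if ExclusiveStartKey is not None:
            keys = [item["myKey"] for item in self.items]
            start = keys.index(ExclusiveStartKey["myKey"]) + 1
        page = self.items[start:start + Limit]
        response = {"Items": page}
        if start + Limit < len(self.items):
            response["LastEvaluatedKey"] = {"myKey": page[-1]["myKey"]}
        return response


table = FakeTable([{"myKey": "v{0}".format(i)} for i in range(5)])
resume_key = None
while True:
    videos, resume_key = fetch_page(table, 2, resume_key)
    for video in videos:
        print(video["myKey"])  # present the video; persist its key per user
    if resume_key is None:
        break
```

Storing the last key a user reached is what lets a later session pass it back as the start key and continue from the same position.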
In response to your edit:
If your use case requires the next video to be unpredictable (e.g. no 2 users can predict what video is next), you have a few problems to solve at the same time:
Selecting an item in an unpredictable/random manner
Tracking what a user has already seen
Putting those two requirements together makes for a tricky access pattern. Let's say you have N videos in your table, and the user has viewed N-1 of these videos leaving only one video unseen. If you are fetching your next video randomly and need to ensure it has not yet been seen, how will you find the last unseen video? How many times would you need to guess before you came across the only unseen video? What query/scan operation could you perform that does this in a single request to DDB? I'm not saying it's impossible, it's just...complicated.
I think it's better to generate a strategy that is unpredictable to the user, but predictable to you when it comes to select the next unseen video.
For example, you could pre-calculate a random order of indexes from 1..N ahead of time, which would represent the order you present the videos for a given user. You could go through that list sequentially, keeping track of the last seen index. That way, you'd always know which video was next and that the video hadn't previously been seen by this user. Fetching that video would be a simple query operation to DDB.
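As a minimal sketch of that idea, the Python below derives a shuffled order per user and walks through it by position. The function names are hypothetical, indexes run from 0 to N-1 to match the VideoNumber numbering in the question, and seeding the shuffle with the user id is just one possible choice. It also assumes the number of videos stays fixed: if N changes, the order would be reshuffled, so a real implementation would probably store the order or handle new uploads separately.

```python
import random


def playback_order(user_id, video_count):
    """Return a per-user shuffled list of video indexes 0..video_count-1."""
    rng = random.Random(user_id)  # same user id -> same order every time
    order = list(range(video_count))
    rng.shuffle(order)
    return order


def next_video(user_id, video_count, last_position):
    """Return (video_index, new_position), or (None, last_position) when exhausted."""
    order = playback_order(user_id, video_count)
    position = last_position + 1
    if position >= len(order):
        return None, last_position
    return order[position], position


# Walk one user through every video exactly once.
position = -1  # nothing seen yet
while True:
    video, position = next_video("user-42", 5, position)
    if video is None:
        break
    print(video)
```

Only the last position has to be stored per user, and the returned index can then be used for a single lookup of the matching video.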
You also asked how to find the number of items in DynamoDB. Unfortunately, there is no DynamoDB equivalent of the SQL count operation. The answer to this question is not straightforward. For the benefit of the community (and to get a diverse set of answers), I'd suggest you make a separate question on Stackoverflow regarding the number of items in a DDB table.

How to constantly store coordinates in a DB, and then update/replace them on changes?

I want to store GPS coordinates from any device in a DB on a SQL Server, and check them in real time from a web page that will constantly ask for positions.
I looked at other questions and answers (on StackOverflow and Google), and everybody wants to add new rows (with the coordinates) to a table where previous coordinates are already stored.
In my case, I don't want to store previous coordinates. I just want to know where the devices are NOW, so I think there is no point in adding new rows.
Therefore, the number of rows will remain constant.
Since I have two tables, DEVICES(idDevice, device) and COORDINATES(device, long, lat), every time a device sends a new position (let's say every 1 second), its value will UPDATE the existing row, replacing the previous value.
My question: is that (the "continuous auto-replacement" technique) the best way to do this, or is there a more optimal way to update positions?
And, as a second question: is that the best way to build the tables for what I want to do?
If you are definitely storing only one set of coordinates then I would suggest you remove COORDINATES and use DEVICES(idDevice, device, long, lat). You must already be handling making sure that a DEVICES row exists so now you can simply UPDATE DEVICES SET long = xxx, lat = yyy WHERE idDevice = deviceId.
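To make the single-table layout concrete, here is a small Python sketch using the built-in sqlite3 module as a stand-in for SQL Server. The helper names and the sample device are made up for illustration. Each device keeps exactly one row, and every new position overwrites the previous one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE DEVICES ("
    " idDevice INTEGER PRIMARY KEY, device TEXT, long REAL, lat REAL)"
)
conn.execute("INSERT INTO DEVICES (idDevice, device) VALUES (1, 'tracker-a')")


def update_position(conn, device_id, lon, lat):
    """Overwrite the stored position; return False if the device row is missing."""
    with conn:
        cur = conn.execute(
            "UPDATE DEVICES SET long = ?, lat = ? WHERE idDevice = ?",
            (lon, lat, device_id),
        )
    return cur.rowcount == 1


def current_position(conn, device_id):
    """Return the latest (long, lat) pair for a device, or None if unknown."""
    return conn.execute(
        "SELECT long, lat FROM DEVICES WHERE idDevice = ?", (device_id,)
    ).fetchone()


update_position(conn, 1, -3.70, 40.42)
update_position(conn, 1, -3.71, 40.43)  # replaces the previous position
print(current_position(conn, 1))
```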

Dealing with gaps in timeline

I'm looking for some assistance to sort out the logic for how I am going to deal with gaps in a feed timeline, pretty much like what you would see in various Twitter clients. I am not creating a Twitter client, however, so it won't be specific to that API. I'm using our own API, so I can possibly make some changes to the API as well to accommodate this.
I'm saving each feed item in Core Data. For persistence, I'd like to keep the feed items around. Let's say I fetch 50 feed items from my server. The next time the user launches the app, I do a request for the latest feed items, get 50 feed items back, and do a fetch to display the feed items in a table view.
Enough time may have passed between the two server requests that a time gap exists between the two sets of feed items.
50 new feed items (request 2)
----- gap ------
50 older feed items (request 1)
* end of items in core data - just load more *
I keep track of whether a gap exists by comparing the oldest timestamp for the feed items in request 2 with the newest timestamp in the set of feed items from request 1. If the oldest timestamp from request 2 is greater than the newest timestamp from request 1, I can assume that a gap exists and I should display a cell with a button to load 50 more. If the oldest timestamp from request 2 is less than or equal to the newest timestamp from request 1, the gap has been filled and there's no need to display the loader.
My first issue is the entire logic surrounding keeping track of whether or not to display the "Load more" cell. How would I know where to display this gap? Do I store it as the same NSManagedObject entity as my feed items with an extra bool + a timestamp that lies in between the two above and then change the UI accordingly? Would there be another, better solution here?
My second issue is related to multiple gaps:
50 new feed items
----- gap ------
174 older feed items
----- gap ------
53 older feed items
* end of items in core data - just load more *
I suppose it would help in this case to go with an NSManagedObject entity so I can just do regular fetches in my Core Data and if they show up amongst the objects, then display them as loading cells and remove them accordingly (if gaps no longer exist between any sets of gaps).
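To illustrate the timestamp comparison, here is a small language-neutral sketch in Python rather than Core Data code. The function name and the batch representation are hypothetical: each fetched batch is reduced to a (newest timestamp, oldest timestamp) pair, and the batches are listed newest first. Every returned range marks a spot where a "Load more" cell would sit.

```python
def find_gaps(batches):
    """Return (older_newest, newer_oldest) ranges where items may be missing.

    batches: list of (newest_ts, oldest_ts) tuples, sorted newest batch first.
    """
    gaps = []
    for newer, older in zip(batches, batches[1:]):
        newer_oldest = newer[1]
        older_newest = older[0]
        # A gap exists only if the newer batch stops strictly after
        # the point where the older batch begins.
        if newer_oldest > older_newest:
            gaps.append((older_newest, newer_oldest))
    return gaps


# Three batches with two gaps, as in the second diagram above.
print(find_gaps([(500, 450), (400, 300), (200, 100)]))
```

Each range could be stored as its own marker record carrying the two boundary timestamps, and deleted once a later fetch covers that range.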
I'd ultimately want to wipe the objects after a certain time has passed as the user probably wouldn't go back in time that long and if they do I can always fetch them from my server if needed.
Any experience or advice anybody has on this subject would be greatly appreciated!

How to clean up inactive players in redis?

I'm making a game that uses redis to store game state. It keeps track of locations and players pretty well, but I don't have a good way to clean up inactive players.
Every time a player moves (it's a semi-slow moving game. Think 1-5 frames per second), I update a hash with the new location and remove the old location key.
What would be the best way to keep track of active players? I've thought of the following
Set some key on the user to expire. Update every heartbeat or move. Problem is the locations are stored in a hash, so if the user key expires the player will still be in the same spot.
Same, but use pub/sub to listen for the expiration and finish cleaning up (seems overly complicated)
Store heartbeats in a sorted set, have a process run every X seconds to look for old players. Update score every heartbeat.
Completely revamp the way I store locations so I can use expire.. somehow?
Any other ideas?
Perhaps use separate redis data structures (though same database) to track user activity
and user location.
For instance, track users currently online separately using redis sets:
[my code snippet is in python using the redis-python bindings, and adapted from example app in Flask (python micro-framework); example app and the framework both by Armin Ronacher.]
from redis import Redis as redis
from time import time
r1 = redis(db=1)
when the function below is called, it creates a key based on current unix time in minutes
and then adds a user to a set having that key. I would imagine you would want to
set the expiry at say 10 minutes, so at any given time, you have 10 keys live
(one per minute).
def record_online(player_id):
    now = int(time())  # current unix time in seconds
    ttl = 600  # 10 minutes TTL, in seconds (expire takes a duration, not a timestamp)
    k1 = "playersOnline:{0}".format(now // 60)  # one key per minute
    r1.sadd(k1, player_id)
    r1.expire(k1, ttl)
So to get all active users just union all of the live keys (in this example, that's 10
keys, a purely arbitrary number), like so:
def active_users(listOfKeys):
    return r1.sunion(listOfKeys)
This solves your "clean-up" issue because of the TTL--inactive users would not appear in your live keys because the keys constantly recycle--i.e., inactive users are only keyed to old timestamps which don't persist in this example (but perhaps are written to a permanent store by redis before expiry). In any event, this clears inactive users from your active redis db.
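The list of live keys passed to active_users is not built anywhere above, so here is one way it might be done. The helper name `live_keys` and its parameters are assumptions. It relies on the same "playersOnline:<unix minute>" naming used when recording a player, and it is pure Python, so it needs no redis connection.

```python
from time import time


def live_keys(now, minutes=10, prefix="playersOnline"):
    """Return the per-minute key names covering the last `minutes` minutes."""
    current_minute = int(now) // 60
    return [
        "{0}:{1}".format(prefix, current_minute - offset)
        for offset in range(minutes)
    ]


# Keys for the current minute and the nine before it.
keys = live_keys(time())
print(keys)
```

The result could then be passed straight to active_users(keys) to get the union of players seen during that period.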