How can I know if I've already played a segment of an M3U8 video? - live-streaming

I'm writing an m3u8 player and have a small issue. One m3u8 video I'm trying to play returns a media sequence that has nothing to do with the segment file names and the file names repeat themselves forever. How can I know if I already played a given segment?
This is what the requests look like over a few seconds:
#EXTM3U
#EXT-X-VERSION:3
#EXT-X-MEDIA-SEQUENCE:5609
#EXT-X-ALLOW-CACHE:YES
#EXT-X-TARGETDURATION:10
#EXTINF:10.000000,
channel001.ts
#EXTINF:10.000000,
channel000.ts
Then a few seconds later:
#EXTM3U
#EXT-X-VERSION:3
#EXT-X-MEDIA-SEQUENCE:5610
#EXT-X-ALLOW-CACHE:YES
#EXT-X-TARGETDURATION:10
#EXTINF:10.000000,
channel000.ts
#EXTINF:10.000000,
channel001.ts
Then again a few seconds later:
#EXTM3U
#EXT-X-VERSION:3
#EXT-X-MEDIA-SEQUENCE:5611
#EXT-X-ALLOW-CACHE:YES
#EXT-X-TARGETDURATION:10
#EXTINF:10.000000,
channel001.ts
#EXTINF:10.000000,
channel000.ts
So the segment names are the same, and the media sequence alone doesn't seem to tell me much. How can I know whether I have already played those specific segments?
Thanks.

The segment names don't matter; you always use the media sequence. EXT-X-MEDIA-SEQUENCE is incremented each time a segment is removed from the playlist.
3. Media Segments
...
Each segment in a Media Playlist has a unique integer Media Sequence
Number. The Media Sequence Number of the first segment in the Media
Playlist is either 0 or declared in the Playlist (Section 4.3.3.2).
The Media Sequence Number of every other segment is equal to the
Media Sequence Number of the segment that precedes it plus one.
and
6.3.5. Determining the Next Segment to Load
...
The first segment to load is generally the segment that the client
has chosen to play first (see Section 6.3.3).
In order to play the presentation normally, the next Media Segment to
load is the one with the lowest Media Sequence Number that is greater
than the Media Sequence Number of the last Media Segment loaded.
RFC 8216 - HTTP Live Streaming
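For example, in your first playlist the EXT-X-MEDIA-SEQUENCE is 5609, so channel001.ts has Media Sequence Number 5609 and channel000.ts has 5610; on the next refresh channel000.ts comes back as 5610 and can be skipped. A minimal sketch of that bookkeeping (the class and the enqueueForPlayback hook are hypothetical, not part of any particular player API):
import java.util.List;

// Identify segments by Media Sequence Number (MSN), never by URI. Segment i of
// a playlist whose header declares #EXT-X-MEDIA-SEQUENCE:m has MSN m + i
// (RFC 8216, section 3).
class SegmentTracker {
    private long lastPlayedMsn = -1; // MSN of the last segment handed to the player

    void onPlaylistRefresh(long mediaSequence, List<String> segmentUris) {
        for (int i = 0; i < segmentUris.size(); i++) {
            long msn = mediaSequence + i;
            if (msn <= lastPlayedMsn) {
                continue; // already played, even though the file name repeats
            }
            enqueueForPlayback(segmentUris.get(i));
            lastPlayedMsn = msn;
        }
    }

    private void enqueueForPlayback(String uri) {
        // hand the segment URI to the downloader / decoder
    }
}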

Related

Get playlist ID from YouTube video

Is there any way to find the playlist ID of a YouTube video using the API, e.g. through the videoId? I looked into the /playlistItems endpoint but I'm not sure where I can find the playlist item ID. I've tried to look this up everywhere but I'm at a loss.
I looked a bit, and the only method I can find would be limited to your own playlists and playlists on the video's channel (or other channels that you specify):
Get the video ID
If you want to check playlists on the same channel, get the channel ID from the video (see https://developers.google.com/youtube/v3/docs/videos#resource)
Get the playlists you want to check (https://developers.google.com/youtube/v3/docs/playlists/list) - either your own playlists or playlists on the channel from step 2
Loop through and get the items on each playlist (https://developers.google.com/youtube/v3/docs/playlistItems/list as you found)
Look for the playlist item that has the same video ID as the one you're examining (https://developers.google.com/youtube/v3/docs/playlistItems#resource)
It's a bit ugly, and limited of course - maybe someone else will share a better method with us. A rough sketch of these steps follows below.
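Here is what that could look like against the plain REST endpoints (plain java.net.http calls; the API key, channel id, and playlist id are placeholders, and the JSON responses would normally be parsed with a proper JSON library):
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Rough sketch: list a channel's playlists, then ask playlistItems whether each
// playlist contains the video.
public class FindPlaylistsContainingVideo {
    static final HttpClient HTTP = HttpClient.newHttpClient();
    static final String KEY = "YOUR_API_KEY";

    public static void main(String[] args) throws Exception {
        String channelId = "UC...";   // step 2: channel id taken from videos.list
        String videoId = "eJjbnFZ6yA8";

        // step 3: the playlists you want to check
        String playlists = get("https://www.googleapis.com/youtube/v3/playlists"
                + "?part=id&maxResults=50&channelId=" + channelId + "&key=" + KEY);
        System.out.println(playlists);

        // steps 4-5: for every "id" in the playlists response, ask whether it
        // contains the video; a non-empty "items" array means it does
        String playlistId = "PL...";  // one id extracted from the JSON above
        String items = get("https://www.googleapis.com/youtube/v3/playlistItems"
                + "?part=id&playlistId=" + playlistId + "&videoId=" + videoId + "&key=" + KEY);
        System.out.println(items);
    }

    static String get(String url) throws Exception {
        return HTTP.send(HttpRequest.newBuilder(URI.create(url)).build(),
                HttpResponse.BodyHandlers.ofString()).body();
    }
}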
An alternative could be:
Since you have the video_id, get the title of the video as well.
With the title of the video, use the search:list endpoint to search for playlists that match the query/criteria - that is, the title of the video.
Loop through the results from the search request and use the code I show in my answer to check whether the video_id is on the playlist.
Example:
video_id: eJjbnFZ6yA8
title: FULL MATCH - The Rock vs. Mankind – WWE Championship Match: Raw, Jan. 4, 1999
Make a search to get playlists that match the search term: "FULL MATCH - The Rock vs. Mankind – WWE Championship Match: Raw, Jan. 4, 1999"
URL:
https://youtube.googleapis.com/youtube/v3/search?part=id%2Csnippet&maxResults=50&q=FULL%20MATCH%20-%20The%20Rock%20vs.%20Mankind%20%E2%80%93%20WWE%20Championship%20Match%3A%20Raw%2C%20Jan.%204%2C%201999&type=playlist&key=[YOUR_API_KEY]
You can try this request in the API Explorer.
From the results of the search, get the playlist - for this example I took the first result from the search; then I check whether the video_id eJjbnFZ6yA8 is on the playlist PLAamU2iv-fSuxZrVqQIBcrrZMTCMbBt2W.
URL:
https://youtube.googleapis.com/youtube/v3/playlistItems?part=id%2Csnippet&playlistId=PLAamU2iv-fSuxZrVqQIBcrrZMTCMbBt2W&videoId=eJjbnFZ6yA8&key=[YOUR_API_KEY]
You can try this request in the API Explorer to check the results.
Keep in mind that the search:list endpoint consumes 100 quota points, so the quota might drain rather quickly, depending on the intensity of the search.
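A rough sketch of this alternative (again with a placeholder API key; each id.playlistId in the search response would then go through the same playlistItems check shown in the previous sketch):
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Search for playlists that match the video title, then check each result with
// playlistItems?playlistId=...&videoId=... as above. Remember the 100-point
// quota cost of every search call.
public class SearchPlaylistsByTitle {
    public static void main(String[] args) throws Exception {
        String key = "YOUR_API_KEY";
        String title = "FULL MATCH - The Rock vs. Mankind – WWE Championship Match: Raw, Jan. 4, 1999";
        String url = "https://youtube.googleapis.com/youtube/v3/search"
                + "?part=id,snippet&type=playlist&maxResults=50"
                + "&q=" + URLEncoder.encode(title, StandardCharsets.UTF_8)
                + "&key=" + key;
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(HttpRequest.newBuilder(URI.create(url)).build(),
                        HttpResponse.BodyHandlers.ofString());
        // each item in the JSON response carries id.playlistId; feed those into
        // the playlistItems check shown in the previous sketch
        System.out.println(response.body());
    }
}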

How to add a shapefile to a map using the ArcGIS JavaScript API when the shapefile exceeds the maximum number of records allowed, i.e. 1000

I am new to ArcGIS. I am referring to the documentation of the ArcGIS JavaScript 3.19 API. I have taken the example from that documentation for adding a shapefile, but when I add a zip file which contains the .shx, .mdf etc. files, it gives me an error like "The maximum number of records allowed (1000) has been exceeded".
Limitations
Files containing more than 1,000 features cannot be added to a map
It's a limitation, according to the documentation on shapefiles.
The link can be found in the sample app Add Shapefile.
How about splitting your file into chunks of fewer than 1,000 shapes? A rough sketch of one way to do that follows.
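For instance, a very rough offline sketch with GeoTools (not the ArcGIS API; the file names are placeholders and the exact store handling may need adjusting to your GeoTools version) that copies the features out into shapefiles of at most 1,000 features each, which could then be added one by one:
import java.io.File;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.geotools.data.DataStore;
import org.geotools.data.DataUtilities;
import org.geotools.data.FileDataStore;
import org.geotools.data.FileDataStoreFinder;
import org.geotools.data.shapefile.ShapefileDataStoreFactory;
import org.geotools.data.simple.SimpleFeatureIterator;
import org.geotools.data.simple.SimpleFeatureSource;
import org.geotools.data.simple.SimpleFeatureStore;
import org.opengis.feature.simple.SimpleFeature;
import org.opengis.feature.simple.SimpleFeatureType;

// Split big.shp into part_0.shp, part_1.shp, ... of at most 1000 features each.
public class ShapefileSplitter {
    public static void main(String[] args) throws Exception {
        FileDataStore source = FileDataStoreFinder.getDataStore(new File("big.shp"));
        SimpleFeatureSource featureSource = source.getFeatureSource();
        SimpleFeatureType schema = featureSource.getSchema();

        List<SimpleFeature> chunk = new ArrayList<>();
        int part = 0;
        SimpleFeatureIterator it = featureSource.getFeatures().features();
        while (it.hasNext()) {
            chunk.add(it.next());
            if (chunk.size() == 1000) {
                writeChunk(schema, chunk, part++);
                chunk.clear();
            }
        }
        it.close();
        if (!chunk.isEmpty()) {
            writeChunk(schema, chunk, part);
        }
        source.dispose();
    }

    private static void writeChunk(SimpleFeatureType schema, List<SimpleFeature> chunk, int part)
            throws Exception {
        Map<String, Serializable> params = new HashMap<>();
        params.put("url", new File("part_" + part + ".shp").toURI().toURL());
        DataStore out = new ShapefileDataStoreFactory().createNewDataStore(params);
        out.createSchema(schema);
        SimpleFeatureStore target = (SimpleFeatureStore) out.getFeatureSource(out.getTypeNames()[0]);
        target.addFeatures(DataUtilities.collection(chunk));
        out.dispose();
    }
}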

How to split a PDF based on a size limit?

I have searched in many places but have been unable to find a good solution for this.
So what I am trying to achieve is as below:
My program will have quite a lot of PDF docs which I will have to send via mail. There is a mail server limitation of 4 MB. So if all the PDFs are less than 4 MB it will be sent as a single mail. Else I will have to create multiple files each less than 4 MB.
Now my program works fine for the following cases:
1: Lots of files, each less than 4 MB: I keep a tally during merging so that none of the merged files goes over 4 MB.
2: All files are pretty small, so merging them together does not reach the 4 MB limit.
But there can be a scenario where there is one file which is, say, 14 MB. I can split that document by pages, but that is not a good solution either, as the page size is not evenly distributed across the pages. I have used iText and PDFBox. Any help/pointers will be highly appreciated!
Imagine a 3000 KB document with ten pages and the following objects:
four font subsets used on every page, each about 50 KB
ten images that figure on a single page, each about 200 KB (one image per page)
four images that figure on every page, each about 50 KB
ten pages with content streams of about 25 KB each
about 350 KB for objects such as the catalog, the info dictionary, the page tree, the cross-reference table, etc...
A single page will need at least:
- the four font subsets: 4 times 50 KB
- the single image: 1 time 200 KB
- the four images: 4 times 50 KB
- a single content stream: 1 time 25 KB
- a slightly reduced cross-reference table, a slightly reduced page tree, an almost identical catalog, an info dictionary of identical size,... 200 KB
Together that's 825 KB. This means that you end up with 8250 KB (10 times 825 KB) if you split up a 10-page 3000 KB PDF document into 10 separate pages.
This example is the result of guess work (based on experience) and it assumes that the PDF is predictable. Most PDFs aren't:
some pages will require high-definition images (maybe even megabytes), other pages won't have any images,
some pages will need many different fonts and font subsets (lots of kilobytes), other pages will consist of merely some vector drawings (a tiny content stream if compressed).
different pages can share a large amount of resources (Form XObjects, Image XObjects,...), other pages won't share any resources.
and so on...
You have noticed that yourself, as you write: I can split that document by pages. But that is also not a good solution as the page size is not evenly distributed across the pages.
That's exactly why your question can have no other answer than: you'll have to do trial and error. No software can predict how much space is needed by a page before you look at what is needed by that page.
Update:
As David indicates in the comments, it is possible to calculate all the resources needed for a page, and to check if the current resources plus the needed resources exceed the maximum file size.
I have written a small example:
public void manipulatePdf(String src, String dest)
        throws IOException, DocumentException {
    Document document = new Document();
    PdfCopy copy = new PdfSmartCopy(document, new FileOutputStream(dest));
    document.open();
    PdfReader reader = new PdfReader(src);
    for (int i = 1; i <= reader.getNumberOfPages(); i++) {
        // check resources needed for reader.getPageN(i);
        copy.addPage(copy.getImportedPage(reader, i));
        // getCounter() reports how many bytes have been written to the output so far
        System.out.println("After adding page: " + copy.getOs().getCounter());
    }
    document.close();
    System.out.println("After closing document: " + copy.getOs().getCounter());
    reader.close();
}
I have executed the example on a PDF sample with 18 pages and this was the output:
After adding page: 56165
After adding page: 111398
After adding page: 162691
After adding page: 210035
After adding page: 253419
After adding page: 273429
After adding page: 330696
After adding page: 351564
After adding page: 400351
After adding page: 456545
After adding page: 495321
After adding page: 523640
After adding page: 576468
After adding page: 633525
After adding page: 751504
After adding page: 907490
After adding page: 957164
After adding page: 999140
After closing document: 1002509
You see how the file size of the copy gradually grows with each page that is added. After all pages are added, the size is 999140 bytes, and then the page tree and cross-reference stream are written, adding another 3369 bytes.
Where it says // check resources needed for reader.getPageN(i);, you could make a guesstimate of the size that will be added for the page and break out of the loop if it exceeds a maximum value.
Why would this be a guesstimate:
You could be counting objects that are already added. If you keep track of the objects (not that difficult), your guess will be more accurate.
I'm using PdfSmartCopy. Suppose that there are two identical objects inside your PDF. Bad PDF software often causes such problems. For instance: the same image bytes are added twice to the file. PdfSmartCopy can detect this and will reuse the first object it encounters instead of adding the redundant bytes of the extra object.
We currently don't have a reader.getTotalPageBytes() in PdfReader because PdfReader tries to use as little memory as possible. It won't load any objects into memory as long as these objects aren't needed. Hence it doesn't know the size of each object before the page is imported.
However, I'll make sure that such a method is added in the next release.
Update:
In the next version, you'll find a tool named SmartPdfSplitter that depends on a new class named PdfResourceCounter. You can use it like this:
PdfReader reader = new PdfReader(src);
SmartPdfSplitter splitter = new SmartPdfSplitter(reader);
int part = 1;
while (splitter.hasMorePages()) {
    splitter.split(new FileOutputStream("results/merge/part_" + part + ".pdf"), 200000);
    part++;
}
reader.close();
Note that this can result in a single-page PDF that exceeds the limit (which was set to 200000 bytes in the code sample) in case that single page cannot be reduced to fewer bytes. In that case, splitter.isOverSized() will return true and you'll have to find another way to reduce the PDF.
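If it helps, here is a hedged variation of the loop above showing where such a check could go (assuming isOverSized() reports on the part that was just written):
PdfReader reader = new PdfReader(src);
SmartPdfSplitter splitter = new SmartPdfSplitter(reader);
int part = 1;
while (splitter.hasMorePages()) {
    splitter.split(new FileOutputStream("results/merge/part_" + part + ".pdf"), 200000);
    if (splitter.isOverSized()) {
        // this part is a single page that could not be kept under 200000 bytes;
        // it needs separate treatment (e.g. recompressing or downscaling its images)
        System.out.println("part_" + part + ".pdf exceeds the size limit");
    }
    part++;
}
reader.close();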
PDF Clown supports page data size prediction without the need for trial and error: since 2010 it has featured a dedicated method (org.pdfclown.tools.PageManager.getSize(Page)) that calculates the actual page data size in memory, without having to write it to a file for trial.
Furthermore, there's another method (org.pdfclown.tools.PageManager.split(long maxDataSize)) purposely implemented to address your kind of scenario which leverages the above-mentioned PageManager.getSize method: it automatically splits a file based on a size limit without creating any intermediate, ugly, stupid, temporary file for trial and error.
You can see a practical example of its use in the org.pdfclown.samples.cli.PageManagementSample (PageDataSizeCalculation and DocumentSplitOnMaximumFileSize cases) included in the downloadable distribution -- here is an example of console output from the PageDataSizeCalculation case:
Page 1: 29380 (full); 29380 (differential); 29380 (incremental)
Page 2: 30493 (full); 1501 (differential); 30881 (incremental)
Page 3: 21888 (full); 1432 (differential); 32313 (incremental)
Page 4: 33781 (full); 4789 (differential); 37102 (incremental)
. . .
where:
full is the page data size encompassing all its dependencies (like shared resources) -- this is the size of the page when extracted as a single-page document;
differential is the additional page data size -- this is the extra content that's not shared with previous pages;
incremental is the data size of the page sublist encompassing all the previous pages and the current one.

How to get the current page size (that is, the raw content) via the MediaWiki API?

I don't know where to put the size argument; here I only managed to get the size of a single edit:
https://en.wikipedia.org/w/api.php?action=query&prop=revisions&rvprop=size&format=xml&titles=United_States_of_America
but I need the size of the whole raw content as it is at the last revision.
You are getting exactly that: the page United States of America contains only the following 69 bytes:
#REDIRECT [[United States]]
{{Redr|move|from long name|printworthy}}
What this code means is that it's a redirect and the name of the real article is United States.
https://en.wikipedia.org/w/api.php?action=query&prop=revisions&rvprop=size&format=xml&titles=United_States
This returns the size you want: 267582 bytes.
Another option would be to let the API follow the redirect automatically, using the redirects parameter:
https://en.wikipedia.org/w/api.php?action=query&prop=revisions&rvprop=size&format=xml&titles=United_States_of_America&redirects
rvprop=size does output the size of the whole revision.
In your example the size is really only 69 bytes, as you can see when you also read the content:
#REDIRECT [[United States]]
{{Redr|move|from long name|printworthy}}
To automatically follow such redirects, use the redirects parameter - in your case prop=revisions&rvprop=size&titles=United_States_of_America&redirects, which outputs a size of currently 267582 bytes.
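If it helps, here is a minimal sketch of making that request from code (plain java.net.http; the XML would normally be parsed rather than printed, and the size you want appears as the size attribute of the <rev> element):
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Ask for the size of the latest revision and let the API resolve the redirect.
public class PageSize {
    public static void main(String[] args) throws Exception {
        String url = "https://en.wikipedia.org/w/api.php"
                + "?action=query&prop=revisions&rvprop=size&format=xml"
                + "&titles=United_States_of_America&redirects";
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(HttpRequest.newBuilder(URI.create(url)).build(),
                        HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body()); // look for <rev size="..."/> in the XML
    }
}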

MergeFactor effect on indexes

My solrconfig.xml configuration is as follows:
<mainIndex>
    <useCompoundFile>false</useCompoundFile>
    <ramBufferSizeMB>32</ramBufferSizeMB>
    <mergeFactor>5</mergeFactor>
    <maxMergeDocs>10</maxMergeDocs>
    <maxFieldLength>10000</maxFieldLength>
    <unlockOnStartup>false</unlockOnStartup>
</mainIndex>
The index size is 12 MB, but when I change my mergeFactor I don't see any effect on my indexes; i.e. the number of segments stays exactly the same. I don't understand which configuration affects the number of segments - I assumed it was mergeFactor.
My next question is: which configuration defines the number of docs per segment, and how large can a segment get before the next segment is created?
Please clarify these points for me.
To your questions:
MergeFactor: every time the index writer flushes its in-memory buffer (with your config, whenever the 32 MB ramBufferSizeMB is reached or a commit happens), a new segment is written to disk. With a mergeFactor of 10, once there are 10 such segments they are merged into one larger segment; once there are 10 of those larger segments they are merged again, and so on.
MaxMergeDocs sets the maximum number of documents a merged segment may contain; segments that have reached this size are not merged any further.
So, in the end, both influence the number of segments you see.
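As a rough illustration with your settings (mergeFactor of 5; a new segment is flushed whenever the 32 MB RAM buffer fills or a commit happens):
flush 1-4   ->  up to 4 small segments on disk
flush 5     ->  those 5 small segments are merged into 1 larger segment
flush 6-9   ->  1 larger segment plus up to 4 new small segments
flush 25    ->  5 larger segments exist and are merged into 1 even larger segment
...
With an index of only 12 MB and a 32 MB buffer, a commit may well flush everything into a single segment, in which case there is nothing to merge and changing mergeFactor shows no visible effect; maxMergeDocs additionally caps how many documents a merged segment may grow to.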
Update:
If you use the DataImportHandler, make sure you don't auto-optimize down to maxSegments=1 on a full import, or you won't see any effect.