How to get the default image of Wikipedia article? - wikipedia-api

Is it possible to get the (default) image associated with a Wikipedia article with an API call if only the URL of the article is known?
Is it possible to make constraints about the image size / resolution in the call (as an image is usually available in different resolutions)?
Is it possible to request the largest image version not larger than MAX_X px?
An example:
https://upload.wikimedia.org/wikipedia/commons/e/e9/Jaguar_%28Panthera_onca_palustris%29_female_Piquiri_River_2.JPG
is 4769x3179 pixels (aspect ratio 1.5).
A request limiting the MAX_X size to 3000 would result in an image scaled to 3000x2000.
A request limiting the MAX_X size to 5000 would result in the image of the original size (4769x3179) because MAX_X is bigger than the image's x size.

Use the MediaWiki API with the pageimages property. For example, for the Wikipedia article Jaguar and a requested maximum thumbnail width of 500 px, the query is:
https://en.wikipedia.org/w/api.php?action=query&prop=pageimages&titles=Jaguar&pithumbsize=500
You can also get the largest (original) image version from the response: take the returned thumbnail URL, remove the /thumb part, and remove everything from the pixel-size prefix (such as 500px-) to the end of the link.
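The steps above can be sketched in Python using only the standard library. This is a minimal sketch, not a definitive client: the helper names (title_from_article_url, pageimages_query_url, original_from_thumb, fetch_thumbnail_url) and the User-Agent string are made up for illustration, and the code assumes the article URL has the usual /wiki/<Title> form and that the thumbnail URL has the usual .../thumb/.../<N>px-<name> layout.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://en.wikipedia.org/w/api.php"


def title_from_article_url(article_url):
    """Extract the page title from a URL like https://en.wikipedia.org/wiki/Jaguar."""
    path = urllib.parse.urlparse(article_url).path
    return urllib.parse.unquote(path.rsplit("/wiki/", 1)[-1])


def pageimages_query_url(title, max_width):
    """Build the pageimages query URL for a title and a thumbnail size."""
    params = {
        "action": "query",
        "prop": "pageimages",
        "titles": title,
        "pithumbsize": max_width,
        "format": "json",
    }
    return API_URL + "?" + urllib.parse.urlencode(params)


def original_from_thumb(thumb_url):
    """Drop the '/thumb' segment and the trailing '<N>px-...' segment."""
    head, last = thumb_url.rsplit("/", 1)
    if "/thumb/" not in thumb_url or "px-" not in last:
        return thumb_url  # does not look like a scaled thumbnail; leave as-is
    return head.replace("/thumb/", "/", 1)


def fetch_thumbnail_url(article_url, max_width):
    """Network call: return the thumbnail URL, or None if the page has no image."""
    url = pageimages_query_url(title_from_article_url(article_url), max_width)
    # Hypothetical User-Agent value; replace it with one identifying your tool.
    request = urllib.request.Request(url, headers={"User-Agent": "pageimage-example/0.1"})
    with urllib.request.urlopen(request) as response:
        data = json.load(response)
    for page in data["query"]["pages"].values():
        thumb = page.get("thumbnail")
        if thumb:
            return thumb["source"]
    return None
```

Calling fetch_thumbnail_url("https://en.wikipedia.org/wiki/Jaguar", 500) should return the thumbnail link, and passing that link to original_from_thumb gives the full-size file. The pageimages module also documents a piprop=original option, which may be a more direct way to get the original image.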

Related

Simulate Camera in Numpy

I have the task of simulating, in numpy, a camera with a full well capacity of 10,000 photons per sensor element. My first idea was to do it like this:
camera = np.random.normal(0.0,1/10000,np.shape(img))
Imgwithnoise= img+camera
but it hardly shows an effect.
Does anyone have an idea how to do it?
From what I interpret from your question, if each physical pixel of the sensor has a 10,000 photon limit, this points to the brightest a digital pixel can be on your image. Similarly, 0 incident photons make the darkest pixels of the image.
You have to create a map from the physical sensor to the digital image. For the sake of simplicity, let's say we work with a grayscale image.
Your first task is to fix the colour bit-depth of the image. That is to say, is your image an 8-bit colour image (which is usually the case)? If so, the brightest pixel has a brightness value of 255 (= 2^8 - 1, for 8 bits). The darkest pixel is always chosen to have a value of 0.
So you'd have to map from the range 0 --> 10,000 (sensor) to 0 --> 255 (image). The most natural idea would be to do a linear map (i.e. every pixel of the image is obtained by the same multiplicative factor from every pixel of the sensor), but to correctly interpret (according to the human eye) the brightness produced by n incident photons, often different transfer functions are used.
A transfer function in a simplified version is just a mathematical function doing this map - logarithmic TFs are quite common.
Also, since it seems like you're generating noise, it is unwise and conceptually wrong to add camera itself to the image img. What you should do, is fix a noise threshold first - this can correspond to the maximum number of photons that can affect a pixel reading as the maximum noise value. Then you generate random numbers (according to some distribution, if so required) in the range 0 --> noise_threshold. Finally, you use the map created earlier to add this noise to the image array.
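The approach described above could look roughly like this in numpy. It is a sketch under stated assumptions: an 8-bit grayscale image, a linear transfer function, and uniformly distributed noise between 0 and noise_threshold photons. The function names and the threshold of 500 photons are illustrative choices, not values from the question.

```python
import numpy as np

FULL_WELL = 10_000   # photons per sensor element (from the question)
MAX_DIGITAL = 255    # brightest value of an 8-bit image


def photons_to_digital(photons):
    """Linear map from 0..FULL_WELL photons to 0..MAX_DIGITAL grey levels."""
    clipped = np.clip(photons, 0, FULL_WELL)  # the well saturates at capacity
    return np.round(clipped * (MAX_DIGITAL / FULL_WELL)).astype(np.uint8)


def digital_to_photons(img):
    """Inverse of the linear map: 8-bit grey levels to photon counts."""
    return img.astype(np.float64) * (FULL_WELL / MAX_DIGITAL)


def add_sensor_noise(img, noise_threshold, rng):
    """Add uniform noise of 0..noise_threshold photons in sensor space."""
    photons = digital_to_photons(img)
    noise = rng.uniform(0.0, noise_threshold, size=img.shape)
    return photons_to_digital(photons + noise)


rng = np.random.default_rng(seed=0)
img = np.full((4, 4), 128, dtype=np.uint8)   # flat mid-grey test image
noisy = add_sensor_noise(img, noise_threshold=500, rng=rng)
```

Working in photon counts and converting back at the end keeps the noise expressed in the sensor's own units, and the clip in photons_to_digital models the well saturating at 10,000 photons. A different noise distribution or a logarithmic transfer function can be swapped in without changing the overall structure.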
Hope this helps and is in tune with what you wish to do. Cheers!

How to get DPI of a PDF file?

Using ImageMagick or GhostScript or any PHP code how can I get the DPI value of PDF files?
Here is the link for two demo files
http://jmp.sh/O5g5wL4 -- of 72 DPI
http://jmp.sh/RxrnYrY -- of 300 DPI
I have used
$image = new Imagick();
$image->readImage('xyz.pdf');
$resolutions = $image->getImageResolution();
It gives the same result for two different PDF files having different DPI.
I have also used
pdfimages -list xyz.pdf
It gives a list of all the information, but how do I fetch the DPI value from that list?
How to get the exact DPI value of a PDF?
As fmw42 says, PDF files themselves have no resolution. However, in your case both files consist of nothing but an image. In one case the image is ~48 MB and in the other it's around 200 MB.
The reason is that the images have a different effective resolution.
In PDF the image is simply a bitmap, a sequence of coloured pixels. These are then drawn onto the underlying media. At this point there is no resolution; the pixels are laid down in a specific media size, in your case 22.5 inches by 81.5 inches.
The effective resolution is given by dividing the number of pixels in the image along a dimension by the size of the covered area in that dimension.
So if I have an image which is 1000x1000 pixels, and I draw it in a 1 inch square, then the effective resolution of the image is 1000 dpi. If I change my mind and draw it in a square 4 inches by 4 inches, then the effective resolution is 250 dpi.
The image hasn't changed, just the area it covers.
Now consider that I have two images drawn in 1 inch squares. The first image is 1000x1000, the second is 500x500. The effective resolution of the first image is 1000 dpi, the effective resolution of the second is 500 dpi.
So you can see that, in PDF, the effective resolution of the image is a combination of the dimensions of the image, and the dimensions of the media it covers.
That's a difficult thing to measure in a PDF file. The area covered is calculated using matrix algebra and can be a combination of several different matrices.
The actual dimensions of the image, by contrast, are quite easy to determine: they are given in the image dictionary. Your images are 1620x5868 and 3372x12225. In both cases the media is the same size: 22.5x81.5 inches.
Since the images cover the entire media, the effective resolutions are:
1620/22.5 = 72 by 5868/81.5 = 72
3372/22.5 = 149.866 by 12225/81.5 = 150
I think MuPDF will give you the image dimensions and the media dimensions. Assuming all your PDF files are constructed like this, you can then simply perform the maths. Note that this won't be so simple for ordinary PDF files, where images don't cover the entire media.
Using mutool info -I -M 150-dpi.pdf gives:
Retrieving info from pages 1-1...
Mediaboxes (1):
1 (6 0 R): [ 0 0 1620 5868 ]
Images (1):
1 (6 0 R): [ DCT ] 3375x12225 8bpc DevCMYK (12 0 R)
So there are your image dimensions (in pixels) and your media size (in points, 1/72 inch). All you need to do is convert the media size to inches and divide one by the other.
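As a worked example of that division, here is a small Python sketch. It assumes the image covers the whole MediaBox and that the MediaBox uses the default PDF unit of 1/72 inch; the function name effective_dpi is made up for illustration, and the numbers are copied from the mutool output above.

```python
POINTS_PER_INCH = 72  # default PDF user-space units per inch


def effective_dpi(mediabox, image_size):
    """Return (x_dpi, y_dpi) for an image that covers the whole MediaBox.

    mediabox   -- [x0, y0, x1, y1] in points, as printed by mutool
    image_size -- (width, height) in pixels
    """
    x0, y0, x1, y1 = mediabox
    width_in = (x1 - x0) / POINTS_PER_INCH
    height_in = (y1 - y0) / POINTS_PER_INCH
    return image_size[0] / width_in, image_size[1] / height_in


# MediaBox and image size taken from the mutool listing for 150-dpi.pdf.
print(effective_dpi([0, 0, 1620, 5868], (3375, 12225)))
```

With these values the MediaBox works out to 22.5 x 81.5 inches and the result is 150 dpi in both directions.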
Note: In Debian and related distros, mutool is contained in the mupdf-tools package, not in the mupdf package itself. It can therefore be installed with sudo apt install mupdf-tools.
I use pdfimages -list from the poppler library, which gives you all the information about the images, including x-ppi and y-ppi columns.

Identify image depth and print size via Carrierwave, mini-magick and ImageMagick

The goal is to validate an image based on dynamic height and width parameters, as well as DPI.
ImageMagick has the identify command, which has a number of options.
-density
will generate the geometry widthxheight
-verbose
will generate a helpful "Print size: " and "Resolution"... among 78 other different lines... where width and height need to be parsed out to meet minimum requirements +/- 2%
So how does one extract those into a method, without stepping on the toes of the intermediary (mini-magick)?
As the section on meta-information indicates, MiniMagick accesses the data through ImageMagick format escapes in a single call. For example, for the vertical (y) density:
image["%y"]
ImageMagick has 47 single-letter attribute percent escapes that allow extraction of the data, provided your call to open the image uses the ".path" suffix:
image = MiniMagick::Image.open(yourClass.theColumn.path)

ImageMagick losing aspect ratio

Very simply I have a script that calls imagemagick on my photos.
The original image is 320 x 444, and I want to create a few scaled down versions but keeping the same aspect ratio
I call imagemagick using
convert oldfile.png -resize 290x newfile.png
I want to scale it to my set widths and have the heights scale accordingly.
I do 80x, 160x and 290x in 3 separate commands.
The smallest 2 produce images with the same original aspect ratio, the 290x does not.
The size of the image it produces is 290 x 402
I have no idea why that one fails to keep the aspect ratio while the other 2 sizes maintain it.
Any ideas?
I think that the problem in the third command is the requested size itself:
in the first command both dimensions are divided by 4: 320/4=80 and 444/4=111
in the second command both dimensions are divided by 2: 320/2=160 and 444/2=222
444 and 320 have only two common divisors greater than 1: 2 and 4. You already used these divisors in your first two commands, so any other (width, height) pair will give you a slightly different aspect ratio: it is impossible to obtain exactly the same aspect ratio with a width of 290.
In fact, your original image has an aspect ratio of 444/320 = 1.3875, so the exact height for a width of 290 would be 290 x 1.3875 = 402.375, which is not a whole number of pixels. With a 290x403 image you would obtain an aspect ratio of 1.389655172, and with a 290x402 image you would get 1.386206897. With the width fixed at 290, no integer height gives the desired aspect ratio.
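The arithmetic above can be checked with a few lines of Python. This is only an illustration of the reasoning, not of how ImageMagick computes sizes internally; the function names are made up for the example.

```python
from fractions import Fraction


def scaled_height(orig_w, orig_h, new_w):
    """Exact (possibly fractional) height that keeps the aspect ratio."""
    return Fraction(orig_h, orig_w) * new_w


def exact_widths(orig_w, orig_h):
    """Widths up to orig_w whose proportional height is a whole number."""
    return [w for w in range(1, orig_w + 1)
            if scaled_height(orig_w, orig_h, w).denominator == 1]


for width in (80, 160, 290):
    height = scaled_height(320, 444, width)
    # Print the width, the exact height, and whether the height is an integer.
    print(width, float(height), height.denominator == 1)

print(exact_widths(320, 444))
```

For a 320x444 original, the widths 80 and 160 give whole-number heights (111 and 222), while 290 gives 402.375, which has to be rounded to a whole pixel. The only widths with an exact match are the multiples of 80.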
In general however Imagemagick always tries to preserve the aspect ratio of the image, as you can read in Imagemagick documentation:
The argument to the resize operator is the area into which the image
should be fitted. This area is not the final size of the image but the
maximum sizes for the image. that is because IM tries to preserve the
aspect ratio of the image more than the final (unless a '!' flag is
given), but at least one (if not both) of the final dimensions should
match the argument given image. So let me be clear... Resize will fit
the image into the requested size. It does NOT fill, the requested box
size.
For further reference see here

Search for images in Flickr with size not smaller than x by y pixels

I have read Flickr API documentation and didn't find dimensions criteria for search so I suppose it's not present. Though there is getSizes method that lets me get available sizes for some specific image by its ID.
Assume I want to find images that are not smaller than 1920x1200 pixels. What is the best way to do it?
Try this (found in the address bar while searching on Flickr):
//...
'dimension_search_mode': 'min',
'height': 1024,
'width': 1024