Only get DateTimeOriginal with exiftool - vb.net

Hey community,
I've been stuck for a few days trying to get the date a .jpg or .png image was taken.
I believe the tag is called DateTimeOriginal.
I want to get just this one specific value, DateTimeOriginal, nothing more and nothing less.
This is part of a homemade project, a program that sorts pictures by the date they were taken.
I'm programming in VB, and for the EXIF data I'm calling a batch file.
I know how to use exiftool. Its common use is:
exiftool file.jpg
But I need something like:
exiftool -DateTimeOriginal file.jpg >> DateTaken.txt
I have tried this, but I'm not getting the date; I only get a list of every .jpg found in the directory, without metadata.
I have searched for a long time for an option like this, but I can't find anything useful. Perhaps there is another, more efficient way to get an image's metadata using only VB.
Does anyone have advice or another idea?
Thanks

You have the correct command to get the DateTimeOriginal tag from a file (exiftool -DateTimeOriginal file.jpg). But you say you are getting a list of filenames in a directory, which sounds like you're passing a directory name, not a file name. If you wish to get DateTimeOriginal for only those files in a directory that have a value in the tag, use exiftool -if "$DateTimeOriginal" -DateTimeOriginal C:/path/to/dir. Any file that doesn't have a DateTimeOriginal will not be listed then.
One thing to note is that the Windows "Date Taken" property will be filled by a variety of metadata tags depending upon the filetype. For example, in PNG files, Windows will use PNG:CreationTime. In jpg files, Windows will use, in order, EXIF:DateTimeOriginal, IPTC:DateCreated + IPTC:TimeCreated, XMP:CreateDate, EXIF:CreateDate, and then XMP:DateTimeOriginal tags.
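Since the question calls exiftool through a batch file and a text file, here is a minimal sketch of calling it directly from VB and reading its output instead. It assumes exiftool is on the PATH; GetDateTimeOriginal is an illustrative name, and the -s3 option asks exiftool to print only the tag value.

```vbnet
Imports System.Diagnostics

Module ExifToolSketch
    ' Returns the DateTimeOriginal value printed by exiftool, or "" if nothing was printed.
    ' Assumption: exiftool can be started by name (it is on the PATH).
    Function GetDateTimeOriginal(imagePath As String) As String
        Dim psi As New ProcessStartInfo("exiftool") With {
            .Arguments = "-s3 -DateTimeOriginal """ & imagePath & """",
            .RedirectStandardOutput = True,
            .UseShellExecute = False,
            .CreateNoWindow = True
        }
        Using proc As Process = Process.Start(psi)
            ' Read everything exiftool wrote to standard output, then wait for it to exit.
            Dim output As String = proc.StandardOutput.ReadToEnd()
            proc.WaitForExit()
            Return output.Trim()
        End Using
    End Function
End Module
```

Passing the path of a single file (not a directory) should give one line of output, which avoids the intermediate DateTaken.txt file.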

After a bit of digging, I found that you can get a list of properties of a bitmap.
Unfortunately the property IDs are numeric and rather cryptic.
Have a look here to find out more
After a bit more digging, it seems that the property with Id &h132 (a hexadecimal number) is the date, stored as an array of bytes in ASCII encoding. This function finds property &h132 and returns the date info as a string in year:month:day hour:minute:second format.
You might get variations with localization, for example /, : or - as the date separator, so to parse the string as a date type you might need to work around that.
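' Requires: Imports System.Drawing, System.Drawing.Imaging, System.Linq
Public Function GetImageTakenDate(theimage As Bitmap) As String
    Dim propItems As List(Of PropertyItem) = theimage.PropertyItems.ToList
    Dim dt As PropertyItem = propItems.Find(Function(x) x.Id = &H132)
    ' Find returns Nothing when the image has no such property.
    If dt Is Nothing Then Return ""
    Dim datestring As String = ""
    ' Each byte of the value is one ASCII character.
    For Each ch As Integer In dt.Value
        datestring += Chr(ch)
    Next
    ' Drop the trailing null terminator.
    datestring = datestring.Remove(datestring.Length - 1)
    Return datestring
End Function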
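To turn that string into a Date, a sketch along these lines may help. It assumes the EXIF layout "yyyy:MM:dd HH:mm:ss"; ParseExifDate is an illustrative name. Note also that, as far as I know, &H132 is the general EXIF DateTime tag, while DateTimeOriginal itself has the Id &H9003, so that Id may be the better one to look up when the goal is the capture time.

```vbnet
Imports System.Globalization

Module ExifDateSketch
    ' Parses an EXIF-style timestamp such as "2020:02:21 14:30:05".
    ' Returns Nothing when the text does not match that layout.
    Function ParseExifDate(text As String) As Date?
        Dim parsed As Date
        If Date.TryParseExact(text, "yyyy:MM:dd HH:mm:ss", CultureInfo.InvariantCulture,
                              DateTimeStyles.None, parsed) Then
            Return parsed
        End If
        Return Nothing
    End Function
End Module
```

Using TryParseExact with the invariant culture keeps the result independent of the machine's regional settings, and a missing or malformed value shows up as Nothing instead of an exception.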

Related

Import txt file data and store it in Multidimensional array in Vb.net

Sorry if the problem is so basic, I'm a bit used to python not VB.net
I'm trying to read text file data (numbers) and store it in array/list
# Sample of text
1.30e+03,1.30e+03,1.30e+03
5.4600e+02,2.7700e+02,2.8000e+02
# PS: I can control the output of the numbers to have delimiter = ',' or space between numbers, whatever is easier to import
I wrote the following code to read the data as strings and store it. However, I don't know how to get a multidimensional array (2D or 3D) instead of a 1D array of strings (e.g. for the text above, it would be a 2x3 array).
' Import Data
Comp_path = FinalPath & "Components_colors.txt"
reader = New StreamReader(Comp_path)
Dim W As String = ""
Dim wArray(10) As String
Dim i As Integer = 0
Do Until reader.Peek = -1
    W = reader.ReadLine()
    wArray(i) = W
    i += 1
Loop
Moreover, I don't know the length of the text file, so I can't determine the length of the array like I did in the code above for the string wArray
For a file like this, you should turn to NuGet for a dedicated CSV parser. There are parsers built into .NET you could also use, but pulling one off of NuGet will also let you parse the values directly into something other than a string.
But if you really don't want to do that you can start with this (assuming Option Infer):
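' Requires: Imports System.IO, System.Linq
Public Function ImportData(filePath As String) As IEnumerable(Of Double())
    ' ReadLines is lazy: nothing is read until the result is enumerated.
    Dim lines = File.ReadLines(filePath)
    ' Split each line on commas and parse every field as a Double.
    Return lines.Select(Function(line) line.Split(","c).Select(AddressOf Double.Parse).ToArray())
End Function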
And use it like this:
Comp_path = FinalPath & "Components_colors.txt"
Dim result = ImportData(Comp_path)
Note this code doesn't actually do any meaningful work yet. It doesn't even read the file. What it does is give you an object (result) that you can use with a For Each loop or LINQ operations. It will read the file in a just-in-time way, parsing out the data for each line as it goes. If you want an array (or List, which you should use in .Net more often), you can append a ToList() call to the end:
Comp_path = FinalPath & "Components_colors.txt"
Dim result = ImportData(Comp_path).ToList()
But you should try to avoid doing that. It's much less efficient in terms of memory use. The first sample only ever needs to keep one line of the file in memory at a time. Adding ToArray() or ToList() means loading the entire file.
Some more notes:
Many newer dynamic platforms like Python don't actually use real arrays in the formal computer science sense (fixed block of contiguous memory). Rather, they use collections, and just call them arrays. .Net has collections, too, but when you declare an array, you get an array. This has nice benefits for performance, but if you don't know you want that or how to take advantage of it you're probably better off asking for a generic List most of the time instead.
Thanks to cultural/internationalization issues, parsing numeric (or date) values to string and back again is much slower and more error-prone than you've believed in the past, especially coming from a dynamic platform. It is slow on these other platforms, too, but they want you to pretend it isn't. The first introduction to a strongly-typed platform like .Net can feel stifling in this area, but once you understand the performance and reliability benefits, you won't want to go back.
In strongly-typed platforms it is very important to understand the data types you are working with at every level of an expression. Otherwise, building and reading statements like the Return line in my answer will be way more difficult and frustrating than it needs to be.
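The culture point above can be made concrete with a sketch like the following, which parses each field with the invariant culture so that "1.30e+03" reads the same on any machine. ParseLine and ImportRows are illustrative names, and skipping blank lines is an assumption about the file.

```vbnet
Imports System.Collections.Generic
Imports System.Globalization
Imports System.IO
Imports System.Linq

Module ImportSketch
    ' Parses one comma-separated line into doubles, ignoring the machine's locale.
    Function ParseLine(line As String) As Double()
        Return line.Split(","c).
            Select(Function(field) Double.Parse(field.Trim(), NumberStyles.Float, CultureInfo.InvariantCulture)).
            ToArray()
    End Function

    ' Streams the file line by line, skipping blank lines.
    Function ImportRows(filePath As String) As IEnumerable(Of Double())
        Return File.ReadLines(filePath).
            Where(Function(line) line.Trim().Length > 0).
            Select(AddressOf ParseLine)
    End Function
End Module
```

Calling ImportRows(Comp_path).ToList() gives a List(Of Double()) that can be indexed like a 2D array, for example rows(1)(2) for the third number on the second line, without knowing the file length in advance.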

How do I get the right date format after using get text on the folder date?

I have tried everything that I can think of to get the right date format. Can anybody help with this RPA problem in UiPath? I have used the 'get text' activity to get the folder date and after that tried to use
Datetime.ParseExact(Str variable,"dd/MM/yyyy", System.Globalization.CultureInfo.InvariantCulture).
It gives me the error:
Assign: String was not recognized as a valid DateTime.
Your help is much appreciated.
edit: I have now sought help from a friend who figured it out. The error was in the string, which could not be seen, because there was an invisible character in the string that I extracted through 'get text' activity. The solution is in 2 steps:
assign another variable to the exact same date and use an if statement to find out if these two strings are equal and you will find out that they are not.
Now use a regex to capture only digits and slash/hyphen which will get rid of the invisible character.
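The two steps above could be sketched roughly like this in VB.NET. ParseFolderDate is an illustrative name, and the sketch assumes the visible text is in dd/MM/yyyy form; the source does not say which invisible character was involved, so the pattern simply drops everything that is not an ASCII digit, a slash or a hyphen.

```vbnet
Imports System.Globalization
Imports System.Text.RegularExpressions

Module CleanDateSketch
    ' Keeps only ASCII digits, "/" and "-", which removes any invisible characters,
    ' then parses the cleaned text as dd/MM/yyyy.
    Function ParseFolderDate(rawText As String) As Date
        Dim cleaned As String = Regex.Replace(rawText, "[^0-9/\-]", "")
        Return Date.ParseExact(cleaned, "dd/MM/yyyy", CultureInfo.InvariantCulture)
    End Function
End Module
```

In UiPath the same two expressions could go into Assign activities. Comparing rawText.Length with cleaned.Length is a quick way to confirm that a hidden character was present.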
Try to use "dd_MM_yyyy" instead of "dd/MM/yyyy".
The reason is because UiPath/VB.Net has a habit of using US date formatting even though you're using Culture Info...It's a real pain
Try this:
pDateString = "21/02/2020"
Assign to a Date type variable (called pDate here): pDate = Date.ParseExact(pDateString, "dd/MM/yyyy", Nothing)
Here we're telling the parser to read the string as a day-first (English) date. The Date value returned may display in US format, but you can simply convert it back to UK format if needed by using something like:
pDate.ToString("dd/MM/yyyy")

Convert Structure in SerializationBinder

Is there something special that needs to be done in order to convert a structure within a SerializationBinder?
Refer to my original question and "answer" to that: Type.GetType returns Nothing in SerializationBinder
The first time it comes to a list of a structure, I get:
Object of type 'System.Runtime.Serialization.TypeLoadExceptionHolder'
cannot be converted...
Well, I'm blind...it turns out the issue was that I was missing a couple brackets at the end in each of the statements to convert a list of something.
Ex: Change:
typeName = String.Format("System.Collections.Generic.List`1[[[my project].[type]], {0}", Assembly.GetExecutingAssembly().FullName)
To:
typeName = String.Format("System.Collections.Generic.List`1[[[my project].[type]], {0}]]", Assembly.GetExecutingAssembly().FullName)
WHY I didn't get an error until after it tried to convert a list of a structure specifically and everything before that seemed to work, I have NO IDEA.
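One way to sidestep bracket counting altogether is to let the runtime build the list type, as in this sketch. ListTypeOf is an illustrative name, and whether this fits a given binder depends on how its type lookup is written, so treat it as an assumption to test.

```vbnet
Imports System.Collections.Generic

Module ListTypeNameSketch
    ' Builds the closed List(Of T) type for an element type, with no hand-written brackets.
    Function ListTypeOf(elementType As Type) As Type
        Return GetType(List(Of )).MakeGenericType(elementType)
    End Function
End Module
```

The returned Type could be handed back from BindToType directly, and its AssemblyQualifiedName property shows the exact bracket layout the runtime expects, which is handy for checking a hand-built string against.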

Split and find specific text?

OK, so I've made an HttpWebRequest and I'm showing the source of the result in a RichTextBox. Now say I have this in the RichTextBox:
<p>Short URL: <code>http://URL.me/u/eywnp</code></p>
How would I go about getting just the "http://URL.me/u/eywnp"? I've tried Split but it didn't work; I guess I'm doing it wrong?
NOTE: the URL will be different every time.
Split isn’t the right tool for the job. It will result in a rather complex piece of code that’s quite brittle (meaning it will break as soon as there’s the slightest change in the input).
For a robust, well-written solution you need to parse the HTML properly. Luckily there exist canned solutions for that: The HtmlAgilityPack library.
Dim doc As New HtmlDocument()
doc.LoadHtml(yourCode)
Dim result = doc.DocumentNode.SelectNodes("//a[@href]")(0).Attributes("href").Value
The only complicated part here is the string "//a[@href]". This is an XPath string. XPath strings are a mini-language that is used to address elements in an HTML or XML document. They are conceptually similar to file paths (like C:\Users\foo\Documents\file.txt) but with a slightly different syntax.
The XPath simply selects all the <a> elements having a href attribute from your document. Then you can grab the first of that collection and retrieve the href attribute’s value.
Thanks for all your help, i did find a solution and i used
Dim iStartIndex, iEndIndex As Integer
With RichTextBox1.Text
    iStartIndex = .IndexOf("<p>Short URL: <code><a href=") + 29
    iEndIndex = .IndexOf(""">", iStartIndex)
    Clipboard.SetText(.Substring(iStartIndex, iEndIndex - iStartIndex))
End With
works perfect so far

Getting the RIGHT word count of a PDF file

The response in this topic helped me understand why sometimes my
PDF fails to find a word and why I keep getting different word counts when using
different PDF word count programs. I decided to use xpdf. I converted it to text
and added the -layout tag and then opened the resulting text file with Word 2003.
I noted the word count. Then I decided, unfortunately, to remove the -layout tag.
This time, though, the word count is different.
Why did that tag affect the word count? Is there an accurate way to find the word count
of a PDF file? I would even pay for such software if I have to so long as it gives me
the right number of words.
(I checked another topic but thought I'd find out if the solution I just offered would solve everything. There was another topic where advancedpdf was recommended.)
I'd like to argue that there is no reliable word counting. One could, for example, just to make your life harder, put each character of this lovely Stackoverflow answer into its own text object and position those objects so that they form a meaningful paragraph for humans only when rendered. Like this:
<html><body><style>
div {float: left;}
</style><div><p>S</p></div><div><p>t</p></div><div><p>a</p></div>
<div><p>c</p></div><div><p>k</p></div>
I would suggest an open source solution using Java. First you would have to parse the pdf file and extract all the text using Tika.
Then I believe you can achieve this simply by scanning the extracted text and counting the words.
Sample code would look like this:
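if (f.getName().endsWith(".txt"))
{
    // Read the extracted text, adding a space after each line so words do not run together.
    BufferedReader in = new BufferedReader(new FileReader(f));
    StringBuilder sb = new StringBuilder();
    String s = null;
    while ((s = in.readLine()) != null)
        sb.append(s).append(' ');
    in.close();
    // Strip punctuation, then split on runs of non-word characters to get individual terms.
    String[] tokenizedTerms = sb.toString().trim().replaceAll("[\\W&&[^\\s]]", "").split("\\W+");
}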
In the tokenizedTerms array, you will have all the terms (words) of the document, and you can count them by reading tokenizedTerms.length (a field, not a method). Hope this was useful. :-)
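Part of the reason different tools report different counts is that "word" has no single definition. This small VB.NET sketch (function names are illustrative) counts the same text under two common rules; it says nothing about what xpdf or Word do internally.

```vbnet
Imports System.Text.RegularExpressions

Module WordCountSketch
    ' Counts chunks separated by whitespace.
    Function CountByWhitespace(text As String) As Integer
        Return Regex.Matches(text, "\S+").Count
    End Function

    ' Counts runs of letters, digits and underscores, so hyphens and apostrophes split words.
    Function CountByWordChars(text As String) As Integer
        Return Regex.Matches(text, "\w+").Count
    End Function
End Module
```

For the text "state-of-the-art don't", CountByWhitespace returns 2 while CountByWordChars returns 6, so even with identical extracted text the "right" number depends on the rule chosen.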