osquery - How can I retrieve a file origin using osquery? - sql

I'm using osquery on Windows and I want to retrieve the origin of a specific file. For example, if I download a file from http://example.com, I'm looking for an osquery query that shows me that the file was downloaded from http://example.com (or something similar). I thought I could derive this by comparing timestamps between the file table and the routes table, but routes has no timestamp column. How can I do that?

I don't see a table for this on Windows, although the information is available on the system through ADS (see this answer). I would open an issue for this on the osquery repo; it would be a valuable table to have.
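Until such a table exists, the stream can be read directly. Below is a minimal Python sketch, assuming the ADS in question is the Zone.Identifier stream that Windows attaches to downloaded files on NTFS. The function names are hypothetical, and the HostUrl and ReferrerUrl keys are written by many browsers but are not guaranteed to be present.

```python
import configparser


def parse_zone_identifier(text: str) -> dict:
    """Parse the INI-style contents of a Zone.Identifier stream."""
    parser = configparser.ConfigParser(interpolation=None, delimiters=("=",))
    parser.optionxform = str  # keep key case (ZoneId, HostUrl, ...)
    parser.read_string(text)
    if not parser.has_section("ZoneTransfer"):
        return {}
    return dict(parser["ZoneTransfer"])


def read_file_origin(path: str) -> dict:
    """Read the Zone.Identifier stream of `path` (Windows, NTFS only)."""
    try:
        # "file:streamname" addresses an alternate data stream on NTFS.
        with open(path + ":Zone.Identifier", encoding="utf-8", errors="replace") as fh:
            return parse_zone_identifier(fh.read())
    except OSError:
        return {}  # no such stream, or not an NTFS volume
```

On a Windows machine, read_file_origin(r"C:\Users\me\Downloads\file.zip") (a hypothetical path) would return a dict that may include HostUrl when the browser recorded it.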
You can use the extended_attributes table. For example:
osquery> select path, key, value, base64 from extended_attributes where path ='/Users/victor/Downloads/osqueryi.zip';
path = /Users/victor/Downloads/osqueryi.zip
key = com.apple.lastuseddate#PS
value = eynzWgAAAAAbZEQgAAAAAA==
base64 = 1
path = /Users/victor/Downloads/osqueryi.zip
key = where_from
value = https://files.slack.com/files-pri/T04QVKUQG-FALAL3WP2/download/osqueryi.zip
base64 = 0
osquery>

+1 on what @groob mentioned, this'd be a nice table to have and I think we've wanted it for some time. I thought we already had an issue cut for this, but I went ahead and made a new one as simple searches weren't turning anything up. Thanks for the question :)
https://github.com/facebook/osquery/issues/5250

Related

Azure Data Factory check file name dynamically

I'm checking daily whether certain files exist in an on-prem folder. The file names follow a specific format, and the first few letters indicate a specific job. For example, xyz-yyyyMMdd.csv, or abc-yyMMdd.csv, etc.
I would like to use a Switch activity to see whether the file for each job has arrived or whether an alert should be raised. How can I let the Switch activity read the 'xyz' portion dynamically, given that the rest of the file name changes?
Thank you
If the prefix is always three letters, as in your examples, you can try this expression:
@substring(item().name,0,3)
If not, you can split on the hyphen instead:
@split(item().name,'-')[0]
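To check the logic of the two expressions above outside the pipeline, the same string operations can be mirrored in Python. This is only an illustration of what they compute, not Data Factory code; the function names are hypothetical.

```python
def prefix_fixed(file_name: str, length: int = 3) -> str:
    """Mirror of substring(name, 0, 3): the first `length` characters."""
    return file_name[:length]


def prefix_before_hyphen(file_name: str) -> str:
    """Mirror of split(name, '-')[0]: everything before the first hyphen."""
    return file_name.split("-")[0]
```

The split variant keeps working when a job prefix is shorter or longer than three letters, which is why it is the safer choice when prefix lengths vary.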

Which is the best practice either to save image name or full URL in database

Which is the better approach for storing image name in database? I have two choices first one is to store just image name e.g. apple.png and second choice is to store full image URL e.g. abc.com/src/apple.png.
Any help will be appreciated. Thanks.
Best practice is not to save the full URL of the image, like abc.com/src/apple.png, but to save only the path to the image, without the host. For example:
Users image : /user/{id}/avatar/img.png
Product image: /product/{id}/1.png
This way you avoid tying your images to a particular server, server path, URL, etc. For example, if you decide to move all your images to another server, you don't need to change every record in the DB.
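A minimal sketch of that idea in Python, assuming the host lives in a single configuration value and only the path is stored in the database. The names IMAGE_BASE_URL and image_url and the example hosts are hypothetical.

```python
# Hypothetical configuration value: the only place the host is written down.
IMAGE_BASE_URL = "https://static.example.com"


def image_url(stored_path: str, base_url: str = IMAGE_BASE_URL) -> str:
    """Join the configured base URL with a path stored in the database."""
    return base_url.rstrip("/") + "/" + stored_path.lstrip("/")
```

Moving the images to another server then means changing IMAGE_BASE_URL once, while the stored rows stay untouched.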
The 2 answers already covered it pretty well. It is indeed best practice to save the directory path instead of saving the entire URL path. Some of the reasons were already covered, such as making it easy to move your folders to another server without having to make any changes whatsoever in your file logic.
What you could do, is also have everything in one directory, refer to that, and then just save the image name. However, I would not recommend that. The other structure simply makes it way easier to navigate and look through. Good file structure is something you'll thank yourself for later in case you ever have to go through things manually for one reason or another.
With that said, I'd like to add this trick into the mix:
$_SERVER['DOCUMENT_ROOT']. This always makes you start from the root directory as opposed to having to do tedious things, such as ../../ etc. It looks like a mess.
So in the end as an image path, you'd have something like:
<img src="<?php echo $_SERVER['DOCUMENT_ROOT'].'/'.$row['filePath']; ?>" >
$row['filePath'] being your stored filepath from the database.
Depending on how your file path is saved, you can lose the / in the image source link.
First of all, upload all images to the public folder of your project, so there is no need to save the domain name.
If you are storing all images in one directory, there is no problem with storing only the image name in the database;
you can then access images like <img src="/foldername/imagename.jpg" />.
But if your project has multiple directories, such as
profile: to save user avatar images,
background: to save background images,
then it is better to save the image with its path in the database, like "/profile/avatar.jpg",
so you can access the image like <img src="imagepathhere" />.
Another common way is to create an image table with these columns:
id
type (enum or int)
name (file name)
Define in your app (better in model) types like
USER_AVATAR = 1;
PRODUCT_IMG = 2;
Define a path map for each image type, like:
$paths = [
USER_AVATAR => '/var/www/project/web/images/users',
...
];
and use the ids from this image table in other tables. This is called a polymorphic association. It is the most flexible way to store images.
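The type-plus-name scheme above can be sketched in Python as follows. This is an illustration only: the ImageRow class, the image_path helper, and the products directory are assumptions, and only the users directory comes from the answer.

```python
from dataclasses import dataclass
from enum import IntEnum
from pathlib import PurePosixPath


class ImageType(IntEnum):
    USER_AVATAR = 1
    PRODUCT_IMG = 2


# Path map per image type. The products directory is a made-up example.
IMAGE_DIRS = {
    ImageType.USER_AVATAR: PurePosixPath("/var/www/project/web/images/users"),
    ImageType.PRODUCT_IMG: PurePosixPath("/var/www/project/web/images/products"),
}


@dataclass(frozen=True)
class ImageRow:
    """One row of the image table: id, type, name."""
    id: int
    type: ImageType
    name: str


def image_path(row: ImageRow) -> PurePosixPath:
    """Resolve the full path from the row's type and file name."""
    return IMAGE_DIRS[row.type] / row.name
```

Other tables then reference ImageRow.id, and relocating a whole category of images only requires editing IMAGE_DIRS.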

Is it possible to rename png files by the time of creation, not after

I am using File[] imageFile = PdfUtilities.convertPdf2Png(new File("MYPATH")) to generate PNGs from a PDF. It names the output files "workingimage01", "workingimage02", ... "workingimage0n" and so on. Is it possible to change this naming at the time the PNGs are generated? Thanks in advance.
I am running this command on 10 PDFs in parallel, so the output files overwrite each other. That is why I need to know whether there is a way to do this, or whether it is simply not supported.
The name is hard coded in the source. You can create an overloaded method that takes another parameter for the working files' name.

What is the best way to retrieve an ID from a file name?

Scenario:
Our customer has provided us with files whose names contain an ID number that we need for indexing purposes.
.\root\dir1\a123.txt (ID is 123)
.\root\dir2\abc345.csv (ID is 345)
.\root\dir3\235.xls (ID is 235)
We know what format to expect based on the file's location and extension. Our customer would like to be able to add
.\root\dir4\foo556.bar (ID is 556)
meaning we cannot write a custom method for each entry under root.
My Solution:
The solution we are thinking of is to store the formats of the file names in an XML file
<root>
<entry>
...
<format>abc###</format>
...
</entry>
</root>
When the customer wants to add a new entry under root, they'll have to give a directory, a file extension and a format. Then on our end we implement a getID() method that uses the format specified in the XML to retrieve the IDs from the file name.
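One way such a getID() could work is to translate the format string into a regular expression. The sketch below is in Python and rests on an assumption the question does not state: a run of # characters stands for one or more digits, and every other character is literal. The function names are hypothetical.

```python
import re


def format_to_regex(fmt: str) -> re.Pattern:
    """Turn a format like 'abc###' into a regex with one digit group."""
    parts = re.split(r"(#+)", fmt)
    if sum(1 for p in parts if p.startswith("#")) != 1:
        raise ValueError("format must contain exactly one run of '#'")
    pattern = "".join(
        r"(\d+)" if p.startswith("#") else re.escape(p) for p in parts
    )
    return re.compile(pattern)


def get_id(file_name: str, fmt: str):
    """Return the ID as an int, or None if the name does not fit the format."""
    stem = file_name.rsplit(".", 1)[0]  # drop the extension
    match = format_to_regex(fmt).fullmatch(stem)
    return int(match.group(1)) if match else None
```

With this, adding .\root\dir4\foo556.bar only needs a new entry with the format foo###, and no new code.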
Question:
Has anyone else dealt with a similar situation? If so is there a better solution than the one I have provided?
Assuming the file name will always be of the form <letters><digits>.<extension>, I would use a simple regular expression to match the relevant part of the name. E.g. .*\\[a-z]*\([0-9]*\)\..* (may vary depending on the RE engine in question).
If you want a generic solution that automatically identifies all matching files, you could use file globs in the shell, if they are available and work for your particular case. Something like:
ls root/*/* | sed -nE 's/^(.*[^0-9])([0-9]+)(\.[A-Za-z]{3,})$/"\1\2\3" \2/p' | xargs -n2 runMyProgramHere
If you need to do it programmatically, directory listings are fairly easy in most languages: list everything in /root, list everything inside each of those directories, keep the files whose names end in digits followed by a dot and an extension, and there's your list.
In pseudo-code:
for (directory in file.getDirectoryList("/root")) {
    for (name in file.getDirectoryList("/root/" + directory)) {
        if (name contains a sequence of numbers followed by a dot ending with an extension) {
            extract id
            store filename and id
        }
    }
}
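For reference, here is one runnable rendering of that pseudo-code in Python. It is a sketch that assumes names of the form optional letters, then digits, then a dot and an extension. It uses a small regex for the name test purely for brevity, and collect_ids and ID_PATTERN are hypothetical names.

```python
import re
from pathlib import Path

# Optional letters, then the ID digits, then ".extension".
ID_PATTERN = re.compile(r"^[A-Za-z]*(\d+)\.[^.]+$")


def collect_ids(root: str) -> dict:
    """Map each matching file under root/<dir>/ to the ID in its name."""
    found = {}
    for directory in sorted(Path(root).iterdir()):
        if not directory.is_dir():
            continue
        for entry in sorted(directory.iterdir()):
            match = ID_PATTERN.match(entry.name)
            if entry.is_file() and match:
                found[entry] = int(match.group(1))  # store filename and id
    return found
```

Files that do not fit the pattern are skipped silently; logging them instead would make unexpected names easier to spot.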
You can probably do this with regexes if you really want, but I tend to avoid regexes in programs unless I have a really good reason to use them. They are often poorly understood and prone to breaking without good error reporting.

Preventing YQL from URL encoding a key

I am wondering if it is possible to prevent YQL from URL encoding a key for a datatable?
Example:
The current guardian API works with IDs like this:
item_id = "environment/2010/oct/29/biodiversity-talks-ministers-nagoya-strategy"
The problem with these IDs is that they contain slashes (/) and these characters should not be URL encoded in the API call but instead stay as they are.
So if I now have this query
SELECT * FROM guardian.content.item WHERE item_id='environment/2010/oct/29/biodiversity-talks-ministers-nagoya-strategy'
while using the following url definition in my datatable
<url>http://content.guardianapis.com/{item_id}</url>
then this results in this API call
http://content.guardianapis.com/environment%2F2010%2Foct%2F29%2Fbiodiversity-talks-ministers-nagoya-strategy?format=xml&order-by=newest&show-fields=all
Instead the guardian API expects the call to look like this:
http://content.guardianapis.com/environment/2010/oct/29/biodiversity-talks-ministers-nagoya-strategy?format=xml&order-by=newest&show-fields=all
So the problem is really just that the / characters get encoded as %2F, which I don't want to happen in this case.
Any ideas on how this can be achieved?
You can also check the full datatable I am using:
http://github.com/spier/yql-tables/blob/master/guardian/guardian.content.item.xml
The URI-template expansions in YQL (e.g. {item_id}) only follow the version 3 spec. With version 4, a slight change to the expansion would do what you want, but alas that is not currently possible with YQL.
So, a solution. You could bring a very, very basic <execute> block into play: one which adds the item_id value to the path as needed.
<execute><![CDATA[
response.object = request.path(item_id).get().response;
]]></execute>
Finally, see the diff against your table (with a few other, minor tweaks to allow the above to work).
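The encoding difference described in the question can be reproduced outside YQL. The sketch below is plain Python, not YQL; it only illustrates percent-encoding a path value with and without / treated as a safe character. The expand function is a hypothetical name.

```python
from urllib.parse import quote

BASE = "http://content.guardianapis.com/"


def expand(item_id: str, keep_slashes: bool) -> str:
    """Append item_id to BASE, optionally leaving '/' unencoded."""
    safe = "/" if keep_slashes else ""
    return BASE + quote(item_id, safe=safe)


item_id = "environment/2010/oct/29/biodiversity-talks-ministers-nagoya-strategy"
print(expand(item_id, keep_slashes=False))  # slashes become %2F
print(expand(item_id, keep_slashes=True))   # slashes stay as path separators
```

The first call corresponds to what the plain {item_id} template produces, the second to the URL the Guardian API expects.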