Query repository file contents from GitLab - sql

I want to retrieve the commit ID of a file, readmeTest.txt, through Invantive SQL like so:
select * from repository_files(29, file-path, 'master')
But for this to work I need a project-id, file-path and a ref.
I know my project-id (I got it from select * from projects) and my ref (master branch) but I don’t know where I can find the path to the file I want to retrieve information of.
So where can I find the value of file-path and ref?
This is my repository directory tree, where I can see the files exist:

You need to join several entities in GitLab to get the information you need.
The parameters of your repository_files table function and their meaning:
project-id can be found as id in the projects entity, as you already knew;
file-path can be found as name in repositories;
ref is the name of a branch, a tag or a commit, so let's assume you want master for now.
Given this information, you need the query below to get all repository files and their content in a project (I narrowed it down to a single project for now):
select pjt.name project_name
, rpe.name repository_name
, rpf.content file
from projects pjt
join repositories(pjt.id) rpe
on 1=1
and rpe.name like '%.%'
join repository_files(pjt.id, rpe.name, 'master') rpf
on 1=1
where pjt.id = 1
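For comparison, GitLab's own REST API addresses a single file with the same three values: project ID, file path and ref. The sketch below only builds the request URL and sends nothing. The host name and the docs/readmeTest.txt path are hypothetical, and it is an assumption that Invantive's repository_files maps onto this endpoint.

```python
from urllib.parse import quote, urlencode


def repository_file_url(base_url: str, project_id: int, file_path: str, ref: str) -> str:
    """Build the GitLab v4 REST URL for one repository file.

    file_path is the full path from the repository root. It must be
    URL-encoded, including the slashes between directories.
    """
    encoded_path = quote(file_path, safe="")
    query = urlencode({"ref": ref})
    return f"{base_url}/api/v4/projects/{project_id}/repository/files/{encoded_path}?{query}"


# Hypothetical host and file path, for illustration only.
url = repository_file_url("https://gitlab.example.com", 29, "docs/readmeTest.txt", "master")
print(url)
```

The point to take away is that file-path is the full path from the repository root, not just the file name.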

Related

How to list duplicates based on different criteria's T-SQL

I'm looking for someone to help me with a very specific task I have.
I'm analysing data from computer hard drives and need to be able to list folders which are duplicated after being extracted from .zip files. Here is an example of the data I am working with:
ItemName               Extension  ItemType
MyZipFolder.zip        .zip       File
MyZipFolder            null       Folder
PersonalDocuments.zip  .zip       File
PersonalDocuments      null       Folder
As you can see the extension '.zip' is included in the 'ItemName' and 'Extension' column. When extracted from a .zip file, it becomes a folder. I need a way of listing either the .zip file or the folder which it becomes after extraction (either will do, it just needs to be listed with the knowledge that it is a duplicate).
The caveat to this is that my data contains plenty of other folders and files with different extensions, e.g. '.docx' and '.msg', so the query needs to discount these.
I hope this makes sense - thanks!
Expected output might look something like this:
ItemName           Extension  ItemType
MyZipFolder        null       Folder
PersonalDocuments  null       Folder
So a list of all the folders which I know have a .zip equivalent in the data.
Not sure yet, but do you mean something like this?
select *
from your_table y
where ItemType = 'Folder'
and exists (
select 1 from your_table yy
where yy.Extension = '.zip'
and yy.ItemName = y.ItemName + '.zip'
)
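To check the EXISTS approach against the sample data, here is a minimal sketch using Python's built-in sqlite3 module. It is not T-SQL: SQLite concatenates strings with || instead of +, and the two extra rows (Report.docx and Holiday) are made up to show that unrelated items are ignored.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (ItemName TEXT, Extension TEXT, ItemType TEXT)")
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?, ?)",
    [
        ("MyZipFolder.zip", ".zip", "File"),
        ("MyZipFolder", None, "Folder"),
        ("PersonalDocuments.zip", ".zip", "File"),
        ("PersonalDocuments", None, "Folder"),
        ("Report.docx", ".docx", "File"),  # made-up row: other extension
        ("Holiday", None, "Folder"),       # made-up row: folder without a .zip
    ],
)

# Keep only folders for which a .zip file with the same base name exists.
rows = conn.execute(
    """
    SELECT y.ItemName
    FROM your_table y
    WHERE y.ItemType = 'Folder'
      AND EXISTS (
          SELECT 1
          FROM your_table yy
          WHERE yy.Extension = '.zip'
            AND yy.ItemName = y.ItemName || '.zip'
      )
    ORDER BY y.ItemName
    """
).fetchall()

duplicated_folders = [name for (name,) in rows]
print(duplicated_folders)
```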
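I think I got what you need:
-- strip the extension so the .zip file and its extracted folder share one group
select replace(ItemName, isnull(Extension, ''), '') as ItemName
from tablename
group by replace(ItemName, isnull(Extension, ''), '')
-- keep only groups that contain both a .zip file and a folder
having count(case when Extension = '.zip' then 1 end) >= 1
and count(case when ItemType = 'Folder' then 1 end) >= 1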

Documentum DQL Getting list subfolders from a few folders

Hi everyone!
I started working with Documentum this month, and here is my problem.
There is a multi-select drop-down list of cabinets. I have to choose some of them to get a list of their inner folders in my result list below.
This query:
select * from dm_folder where folder(id($param$))
or
select * from dm_folder where folder($param$)
where param is object_name
works with single drop-down select.
I've tried to insert "in"
select * from dm_folder where folder in($param$)
results with
[DM_QUERY_E_SYNTAX]error: "A Parser Error (syntax error) has occurred in the vicinity of: select * from dm_folder where folder in"
or
select * from dm_folder where folder(id in($param$))
results with
select * from dm_folder where folder(id in ('0c0511d48000105',' 0c0511d48000106'))
[DM_QUERY_E_SYNTAX]error: "A Parser Error (syntax error) has occurred in the vicinity of: select * from dm_folder where folder(id in"
I also tried the "return multivalue" flag in the queries above, but it doesn't work.
Can someone help, please? Thank you!
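Try this one:
SELECT * FROM dm_folder WHERE r_object_id IN ('0c0511d48000105','0c0511d48000106')
I think it's clear where the error is: FOLDER is a predicate whose arguments are folder expressions (an ID function, a folder path, or DEFAULT), so it cannot wrap an IN list. Note also that the second ID in your parameter has a leading space inside the quotes.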
From DQL guide:
The FOLDER predicate
The FOLDER predicate identifies what folders to search. The syntax is:
[NOT] FOLDER(folder_expression {,folder_expression} [,DESCEND])
The folder_expression argument identifies a folder in the current repository. You cannot search a remote folder (a folder that does not reside in the current repository). Valid values are:
• An ID function
The ID function (described in The ID function, page 29) identifies a particular folder.
• A folder path
A folder path has the format:
/cabinet_name{/folder_name}
Enclose the path in single quotes. Because cabinets are a subtype of folder, you can specify a cabinet as the folder.
• The keyword DEFAULT
The keyword DEFAULT directs the server to search the user’s default folder. Note that a user’s default folder is the same as the user’s default cabinet (because cabinets are a subtype of folders).
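Going by the syntax quoted above, FOLDER takes a comma-separated list of folder expressions, so a multi-select could be turned into one ID() call per selected value. The following is a minimal sketch of that string building in Python; the helper name folder_predicate is hypothetical, and whether your form framework lets you post-process the parameter this way is an assumption.

```python
from typing import Iterable


def folder_predicate(folder_ids: Iterable[str], descend: bool = False) -> str:
    """Build a DQL FOLDER predicate with one ID() expression per folder ID."""
    # Trim stray whitespace, such as the leading space in ' 0c0511d48000106'.
    ids = [fid.strip() for fid in folder_ids if fid.strip()]
    if not ids:
        raise ValueError("at least one folder ID is required")
    for fid in ids:
        # Object IDs are plain alphanumeric strings; reject anything else
        # rather than pasting it into the query text.
        if not fid.isalnum():
            raise ValueError(f"unexpected characters in folder ID: {fid!r}")
    expressions = [f"ID('{fid}')" for fid in ids]
    if descend:
        expressions.append("DESCEND")
    return "FOLDER(" + ", ".join(expressions) + ")"


dql = "select * from dm_folder where " + folder_predicate(
    ["0c0511d48000105", " 0c0511d48000106"]
)
print(dql)
```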
Edit 1:
SELECT * FROM dm_folder WHERE ANY i_folder_id IN ('0c0511d48000105','0c0511d48000106')
With this query you are looking for folder-type objects whose parent is any of the folders specified as the parameter.
Edit 2:
SELECT * FROM dm_folder WHERE i_cabinet_id IN (<list of ids>)
This returns all folder objects under the given cabinets.

Error: Not found: Dataset my-project-name:domain_public was not found in location US

I need to query a dataset provided by a public project. I created my own project and added their dataset to my project. There is a table named domain_public. When I query this table I get this error:
Query Failed
Error: Not found: Dataset my-project-name:domain_public was not found in location US
Job ID: my-project-name:US.bquijob_xxxx
I am in a non-US country. What is the issue, and how do I fix it?
EDIT 1:
I changed the processing location to asia-northeast1 (I am based in Singapore) but got the same error:
Error: Not found: Dataset censys-my-projectname:domain_public was not found in location asia-northeast1
Here is a view of my project and the public project censys-io:
Please advise.
EDIT 2:
The query I typed, based on the Censys tutorial, is:
#standardsql
SELECT domain, alexa_rank
FROM domain_public.current
WHERE p443.https.tls.cipher_suite = 'some_cipher_suite_goes_here';
When I changed the FROM clause to:
FROM `censys-io.domain_public.current`
And the last line to:
WHERE p443.https.tls.cipher_suite.name = 'some_cipher_suite_goes_here';
It worked. Should I understand that I must always include projectname.dataset.table (if I'm using the correct terms), and report the typo to Censys? Or is this a special case for this project for some reason?
BigQuery can't find your data
How to fix it
Make sure your FROM location contains 3 parts
A project (e.g. bigquery-public-data)
A dataset (e.g. hacker_news)
A table (e.g. stories)
Like so
`bigquery-public-data.hacker_news.stories`
*note the backticks
Examples
Wrong
SELECT *
FROM `stories`
Wrong
SELECT *
FROM `hacker_news.stories`
Correct
SELECT *
FROM `bigquery-public-data.hacker_news.stories`
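If you build queries in code, one way to avoid forgetting a part is to assemble the reference from its three components. This is only a sketch; the helper name qualified_table is hypothetical and it does plain string formatting without talking to BigQuery.

```python
def qualified_table(project: str, dataset: str, table: str) -> str:
    """Return a backtick-quoted project.dataset.table reference."""
    parts = (project, dataset, table)
    for part in parts:
        # An empty part or an embedded backtick would produce a broken reference.
        if not part or "`" in part:
            raise ValueError(f"invalid identifier part: {part!r}")
    return "`" + ".".join(parts) + "`"


table_ref = qualified_table("bigquery-public-data", "hacker_news", "stories")
query = f"SELECT * FROM {table_ref}"
print(query)
```

Quoting the whole reference in one pair of backticks also covers project names that contain a hyphen.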
In the Web UI, click the Show Options button and then select your location for "Processing Location".
Specify the location in which the query will execute. Queries that run in a specific location may only reference data in that location. For data in US/EU, you may choose Unspecified to run the query in the location where the data resides. For data in other locations, you must specify the query location explicitly.
Update
As stated above, queries that run in a specific location may only reference data in that location.
Assuming that censys-io.domain_public dataset has its data in US - you need to specify US for Processing Location
The problem turned out to be due to wrong table name in the FROM clause.
The right FROM clause should be:
FROM `censys-io.domain_public.current`
While I was typing:
FROM domain_public.current
So the project name is required in the FROM clause, and the backticks are required because of the hyphen (-) in the project name.
Make sure your FROM location contains 3 parts, as @stevec mentioned:
A project (e.g. bigquery-public-data)
A dataset (e.g. hacker_news)
A table (e.g. stories)
But in my case the query was running as legacy SQL within the Google Apps Script editor, so you need to set useLegacySql to false, for example:
var projectId = 'xxxxxxx';
var request = {
query: 'select * from project.database.table',
useLegacySql: false
};
var queryResults = BigQuery.Jobs.query(request, projectId);
Check the exact case (upper or lower) and spelling of the table or view name.
Copy it from the table definition and your problem will be solved.
I was using FPL009_Year_Categorization instead of FPL009_Year_categorization
(C instead of c) and got the error "not found in location asia-south1".
I copied the name with the exact case and the problem was resolved.
In your BigQuery console, go to the Data Explorer in the left pane, click the three small dots, then select the query option from the list. This step confirms that you have chosen the correct project and dataset. You can then edit the query in the query pane on the right.
Maybe the data location was changed in the create dataset options. It should be US or the default location.

Doctrine - Get items based on the count of a many to many

I have the following tables, User, Project and Images.
A User has a one to many relationship with Projects and Images. Each Project and Image is owned by a User.
A Project has a many to many relationship with Images. So each Project can have many Images, and an Image can appear within many Projects.
I want to write a DQL query to get all Images, for a specific User that are not included in any Projects. This I can write in SQL.
Is writing in SQL the best way to go?
Or should I be using DQL to do this?
I have tried writing the DQL, but it's hard work!
Edit
From within my Image Repo I am now doing this
$qb = $this->createQueryBuilder("i");
$qb->select("i")
->from('MyBundleName:User','u')
->innerJoin('u.images', 'user_images')
->where('IDENTITY(u.id) = :user_id')
->andWhere('NOT EXISTS (
SELECT p
FROM MyBundleName:Project p
WHERE user_images MEMBER of p.images
)')
->setParameter('user_id', $user_id);
I have replaced the / syntax with : for my classes, as they failed when using /
I am still getting this error though;
[Semantical Error] line 0, col 131 near 'id) = :user_id': Error:
Invalid PathExpression. Must be a SingleValuedAssociationField.
The function createQueryBuilder requires an alias, I am passing it "i" - is that right?
I then give it "i" again when calling select?
If I remove the innerJoin then it works, but the results are wrong. It returns all images, even those that exist within a project.
I can't say how difficult it would be in DQL, but I can tell you that in SQL it sounds pretty simple:
SELECT I.*
FROM IMAGES I
INNER JOIN USERS U ON (I.USER_ID = U.USER_ID)
WHERE NOT EXISTS (
SELECT *
FROM PROJECTS P, PROJECT_IMAGES PI
WHERE P.USER_ID = U.USER_ID
AND PI.PROJECT_ID = P.PROJECT_ID
AND I.IMAGE_ID = PI.IMAGE_ID
)
This returns the images owned by a user that do not exist in any project that the user owns.
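To see the NOT EXISTS anti-join in isolation, here is a small sketch using Python's built-in sqlite3 module. The table and column names (images, project_images) and the sample rows are hypothetical, and it uses the simpler reading of the question: images of a user that appear in no project at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE images (image_id INTEGER PRIMARY KEY, user_id INTEGER);
    CREATE TABLE project_images (project_id INTEGER, image_id INTEGER);
    -- user 10 owns images 1-3, user 20 owns image 4
    INSERT INTO images VALUES (1, 10), (2, 10), (3, 10), (4, 20);
    -- images 1 and 3 are used in projects, images 2 and 4 are not
    INSERT INTO project_images VALUES (100, 1), (101, 1), (100, 3);
    """
)


def unused_image_ids(user_id):
    """Return the IDs of the user's images that are linked to no project."""
    rows = conn.execute(
        """
        SELECT i.image_id
        FROM images i
        WHERE i.user_id = ?
          AND NOT EXISTS (
              SELECT 1
              FROM project_images pim
              WHERE pim.image_id = i.image_id
          )
        ORDER BY i.image_id
        """,
        (user_id,),
    ).fetchall()
    return [image_id for (image_id,) in rows]


print(unused_image_ids(10))
```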
I don't know what your entities look like but given the relations something like this should work. The key is combining NOT EXISTS and MEMBER OF (you want to make sure that for all returned images no project exists that the image is a member of).
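// Start from the entity manager so the alias "i" is not already taken
// by the repository's default query.
$qb = $this->getEntityManager()->createQueryBuilder();
$qb->select('i')
->from('MyBundle\Entity\User', 'u')
->innerJoin('u.images', 'i')
// u.id is a plain field; IDENTITY() only accepts a single-valued association.
->where('u.id = :user_id')
->andWhere('NOT EXISTS (
SELECT p
FROM MyBundle\Entity\Project p
WHERE i MEMBER OF p.images
)')
->setParameter('user_id', $user_id);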

Find the tfs path of merged branch

Using TFS, I have trunk $/project/trunk and a branch $/project/dev/feature/new_one.
I have merged my branch back to trunk as follows:
C33($/project/trunk)
| \
| \
| C32($/project/dev/feature/new_one)
| |
| |
| |
...
I use the TFS API and can find the merge changeset C33. With the method QueryMerges(), I'm able to find the parent changeset C32 with all the changes on the files, but not the information I need :(
Is there a way, using the TFS API, to find the repository path of the merged branch, $/project/dev/feature/new_one?
With the changeset C32, I'm only able to get paths of modified files, like $/project/dev/feature/new_one/path/to/file.txt but I'm unable to extract the path of the branch from the full path of the file :(
PS: A solution that works from TFS 2008 onwards would be best, but one that only works from 2010 onwards would still be good...
PS2 : solving this problem will help to manage merge changesets in git-tfs which I develop...
Unfortunately there is no API method to get a branch for a given item path, which you would think is a fairly common use case.
From TFS 2010 onwards you can use VersionControlServer.QueryRootBranchObjects to query all branches in version control. With RecursionType.Full as the parameter to this method, you get a BranchObject array of all branches with no parent and all of their descendants. You can then determine the branch for a given file path as follows:
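using System;
using System.Linq;
using Microsoft.TeamFoundation.Client;
using Microsoft.TeamFoundation.VersionControl.Client;

var collection = new TfsTeamProjectCollection(new Uri("http://tfsuri"));
var versionControl = collection.GetService<VersionControlServer>();
var branchObjects = versionControl.QueryRootBranchObjects(RecursionType.Full);
var mergeFilePath = "$/project/dev/feature/new_one/path/to/file.txt";
var branch = branchObjects.SingleOrDefault(b => {
    var branchPath = b.Properties.RootItem.Item;
    // Append a trailing slash so "$/a/b" does not match "$/a/bc/...".
    var prefix = branchPath.EndsWith("/") ? branchPath : branchPath + "/";
    return mergeFilePath.StartsWith(prefix, StringComparison.OrdinalIgnoreCase);
});
// SingleOrDefault returns null when no branch contains the file.
Console.WriteLine(branch != null ? branch.Properties.RootItem.Item : "(no branch found)");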
As shown, the path to the branch is at BranchObject.Properties.RootItem.Item. I believe it is safe to find the relevant BranchObject in the array simply by checking which branch's path is contained in the merge file's path (it can match at most one branch, as TFS enforces that only one branch can exist in a given folder hierarchy).
Just be aware that I have been burned by this Connect issue when using QueryRootBranchObjects in TFS 2012. The cause was some spurious branches that had apostrophes in the branch name.
The workaround to this is to use VersionControlServer.QueryBranchObjects, however this takes an item identifier which is the exact path to the branch. Clearly you don't know the branch path at this point as all you have is a file path, so you have to recurse up the directories of the file path calling QueryBranchObjects each time until you get a match.
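The walk up the directory tree can be sketched independently of the TFS API. In the Python sketch below, the is_branch callback is a hypothetical stand-in for a call to QueryBranchObjects on the candidate path, and the set of known branches is sample data taken from the question.

```python
from typing import Callable, List, Optional


def ancestor_paths(file_path: str) -> List[str]:
    """List the parent folders of a server path, deepest first.

    The "$" root itself is left out because it cannot be a branch.
    """
    parts = file_path.rstrip("/").split("/")
    return ["/".join(parts[:i]) for i in range(len(parts) - 1, 1, -1)]


def find_branch_root(file_path: str, is_branch: Callable[[str], bool]) -> Optional[str]:
    """Return the first ancestor folder that is a branch, or None."""
    for candidate in ancestor_paths(file_path):
        if is_branch(candidate):
            return candidate
    return None


# Stand-in for the server lookup: the branches from the question.
known_branches = {"$/project/trunk", "$/project/dev/feature/new_one"}
root = find_branch_root(
    "$/project/dev/feature/new_one/path/to/file.txt",
    lambda path: path in known_branches,
)
print(root)
```

Each candidate costs one server call in the real API, so starting from the deepest folder keeps the number of calls equal to the depth of the file below its branch root.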