Reference Items in context - msbuild

I am trying to figure out a way to access the Items from the context without having to declare an ItemGroup explicitly.
Currently trying for the Copy task:
<Copy SourceFiles="C:\blabla\**\*.*" DestinationFiles="%(?.RecursiveDir)" />
What can I use in place of "?" to select the Items in context?
The reason is that I have an MSBuild project file being generated via XSLT, and there is an unknown number of folders and files in the input XML (some of them follow a different structure under the destination folder; in that case I intend to use different metadata in place of RecursiveDir). Is it possible to achieve this without having to declare loads of ItemGroups (or an ItemGroup with lots of Items)?
I tried searching for this, but all I found were posts with ItemGroups declared.

Alexey Shcherbak wrote:
You want to refer to item metadata without explicitly declaring the item itself, so I doubt you will be able to do this. Also, the Copy task requires that SourceFiles be of type ITaskItem[] (literally, it requires an item collection). The MSDN description of the Copy task has an exact example you can follow, but you do have to declare an ItemGroup with nested item elements.
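For reference, here is a minimal sketch of that ItemGroup-plus-Copy pattern (the item name, target name, and destination path below are placeholders, not taken from the MSDN example):
<ItemGroup>
  <SourceFiles Include="C:\blabla\**\*.*" />
</ItemGroup>
<Target Name="CopyAll">
  <!-- RecursiveDir preserves the folder structure below the ** wildcard -->
  <Copy SourceFiles="@(SourceFiles)"
        DestinationFiles="@(SourceFiles->'C:\destination\%(RecursiveDir)%(Filename)%(Extension)')" />
</Target>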
You may wonder whether having an item with a lot of files slows MSBuild down.
The answer is: it depends. What numbers do you mean by a huge fileset? It's true that the MSBuild engine expands and evaluates each item group in memory, and a huge fileset could lead to a bigger memory footprint. But MSBuild is not meant to be used as your scripting language of choice (even PowerShell has issues with 250K+ files in one directory, and Windows itself does too). If you just need to perform a copy without access to the full metadata (except the recursive dir), use the Exec task and invoke robocopy.exe - it works way better than anything else available out of the box.
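A minimal sketch of that Exec call, assuming robocopy.exe is on the PATH (the paths and options here are illustrative):
<Target Name="RobocopyFiles">
  <!-- robocopy reports success with exit codes 0-7, which Exec would otherwise treat as failure -->
  <Exec Command="robocopy.exe C:\blabla C:\destination /E /NP"
        IgnoreExitCode="true" />
</Target>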
As an addition: huge numbers should be tested and evaluated before we declare that a particular tool isn't acceptable for the job. I think that since MSBuild can deal with big solutions, it can probably deal with pretty big filesets too; it's just a question of resources and speed. But any tool has its insurmountable limits.
Actually, I meant not a robocopy extension but robocopy.exe itself (en.wikipedia.org/wiki/Robocopy); you can easily call it with the Exec task. And surely hardlinks are unbeatable in terms of "copy" speed - but keep in mind that they only work within a single disk volume (because it's not an actual copy, it just adds another file name for the same set of bytes). In case you need an actual copy to another drive or over the network, robocopy will shine again.
PS: 20k files are far from my definition of huge ;) We dealt with ~280k-300k small files, a total volume of around 80 GB. PowerShell for the plumbing and robocopy for the actual bit-moving won that round.

Is it possible to pass a variable from the build process to Visual Basic code?

My goal is to create build definitions within Visual Studio Team Services for both test and production environments. I need to update 2 variables in my code which determine which database and which blob storage the environment uses. Up till now, I've juggled this value in a Resource variable, and pulled that value in code from My.Resources.DB for a library, and Microsoft.Azure.CloudConfigurationManager.GetSetting("DatabaseConnectionString") for an Azure worker role. However, changing 4 variables every time I do a release is getting tiring.
I see a lot of posts that get close to what I want, but they're geared towards C#. For reasons beyond my influence, this project is written in VB.NET. It seems I have 2 options. First, I could call the MSBuild process with a couple of defined properties, passing them to the .metaproj build file, but I don't know how to get them to be used in VB code. That's preferable, but, at this point, I'm starting to doubt that this is possible.
I've been able to set some pre-processor constants, to be recognized in #If-#Else directives.
#If DEBUG = True Then
BarStaticItemVersion.Caption = String.Format("Version: {0}", "1.18.0.xxx")
#Else
BarStaticItemVersion.Caption = String.Format("Version: {0}", "1.18.0.133")
#End If
msbuild CalbertNG.sln.metaproj /t:Rebuild /p:DefineConstants="DEBUG=False"
This seems to work, though I need to Rebuild to change the value of that constant. Should I have to? Should Build be enough? Is this normal, or an indication that I don't have something set quite right?
I've seen other posts that talk about pre-processing the source files with some other builder, like Ant, but that seems like overkill. It feels like I'm close here. But I want to zoom out and ask, from a clean sheet of paper, if you're given 2 variables which need to change per environment, you're using VB.NET, and you want to incorporate those variable values in an automated VS Team Services build process upon code check-in, what's the best way to do it? (I want to define the variables in the VSTS panel, but this just passes them to my builder, so I have to know how to parse the call to MSBuild to make these useful.)
I can control picking between 2 static strings, now, via compiler directives, but I'd really like to reference the Build.BuildNumber that comes out of the MSBuild process to display to the user, and, if I can do that, I can just feed the variables for database and blob container via the same mechanism, and skip the pre-processor.
You've already found a way to pass data from the MSBuild arguments directly into the code. An alternative is to use the Condition attribute in your project files to make certain property groups optional; it even allows you to include specific files conditionally. You can control conditions by passing /p:ConditionalProperty=value on the MSBuild command line. This at least ensures people use a set of values that make sense together.
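As an illustration (the property value "Production" and the file Settings.Production.vb are made-up examples), a condition-driven section toggled with /p:ConditionalProperty=Production could look like this:
<PropertyGroup Condition="'$(ConditionalProperty)' == 'Production'">
  <DefineConstants>DEBUG=False,TRACE=True</DefineConstants>
</PropertyGroup>
<ItemGroup Condition="'$(ConditionalProperty)' == 'Production'">
  <!-- conditionally include an environment-specific source file -->
  <Compile Include="Settings.Production.vb" />
</ItemGroup>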
The problem is that when MSBuild is running in incremental mode it is likely not to process your changes (as you've noticed); the reason for this is that the input files remain unchanged since the last build and are all older than the last generated output files.
To bypass this behavior you'd normally create a separate solution configuration and override the output location for all projects to be unique for that configuration. Combined with setting the compiler constants for that specific configuration, you're ensured that when building that Configuration/Platform combination, incremental builds work as intended.
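A sketch of what that per-configuration override might look like in a project file (the "Staging" configuration name, output path, and constants are examples):
<PropertyGroup Condition="'$(Configuration)|$(Platform)' == 'Staging|AnyCPU'">
  <OutputPath>bin\Staging\</OutputPath>
  <DefineConstants>DEBUG=False,TRACE=True,STAGING=True</DefineConstants>
</PropertyGroup>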
I do want to echo some of the comments from JerryM and Daniel Mann. Some items are better stored elsewhere or updated before you actually start the compile phase.
Possible solutions:
Store your configuration data in config files and use Configuration Transformation to generate the right config file based on the selected solution configuration. The process is explained on MSDN. To enable configuration transformation on all project types, you can use SlowCheetah.
Store your configuration data in the config files, use MSDeploy, and specify a Parameters.xml file that matches the deploy package. It will perform the transformation at deploy time and will actually allow your solution to contain a standard config file you use at runtime, plus a publish profile which will post-process your configuration. You can use a SetParameters.xml file to override the variables at deploy time.
Create an installer project (such as through Wix) and merge the final configuration at install time (similar to the MsDeploy). You could even provide a UI which prompts for specific values (and can supply default values).
Use a CI server, like the new TFS/VSTS 2015 task-based build engine, and combine it with a task that can search and replace tokens, like the Replace Tokens task, the Tokenization Task, or Colin's ALM Corner Build and Release Tasks - plus a whole bunch that specifically deal with versioning. Handling these things in the CI server also allows you to do a quick build locally at all times and do these relatively expensive steps on the build server (patching source code breaks incremental builds in MSBuild, because there are always newer input files).
When talking specifically about versioning, there are a number of ways to set the AssemblyVersion and AssemblyFileVersion just before compile time; usually it involves overriding the AssemblyInfo.cs file before compilation. Your code could then use reflection to read the value at runtime. You can use the AssemblyInformationalVersion to specify something like you do in the example above which contains .xxx or other text. It also ensures that the version displayed always reflects the information obtained when reading the file properties through Windows Explorer.
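As a rough sketch of that pattern (not the exact approach referenced above: the $(BuildNumber) property is assumed to be passed in via /p:, the generated file name is made up, and the attribute must not already be declared in your AssemblyInfo file), you could generate the attribute just before compilation:
<Target Name="StampVersion" BeforeTargets="CoreCompile">
  <!-- writes a one-line VB file containing the informational version attribute -->
  <WriteLinesToFile File="$(IntermediateOutputPath)VersionInfo.vb"
                    Lines="&lt;Assembly: System.Reflection.AssemblyInformationalVersion(&quot;$(BuildNumber)&quot;)&gt;"
                    Overwrite="true" />
  <ItemGroup>
    <Compile Include="$(IntermediateOutputPath)VersionInfo.vb" />
  </ItemGroup>
</Target>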

SSDT - Build Deployment Script without dacpac

I've got a question about building a deployment script using SSDT.
Could anyone tell me if it's possible to build a deployment script using SQLPackage.exe where the source file is NOT a dacpac file, but uses the .sql files instead?
To give some background, I've created a project in Visual Studio 2012 for my database schema. This works great, and SSDT builds the folder structure without a problem (functions, stored procedures etc which contain all the .sql files).
Here's the problem - the database in question is from a legacy system, and is riddled with errors. Most of these errors we don't care about anymore and it's not practical or safe to fix them all, so for years we've basically ignored them. However it means we can't build the project and therefore can't generate the dacpac file. Now this doesn't prevent us from doing the schema compare and syncing the database with the file system (a local mercurial repository). However it does seemingly prevent us from building a deployment script.
What I'm looking for is a way of building the deployment script using SQLPackage.exe without having to generate the dacpac file. I need to use the .sql files in the file system instead. Visual Studio will produce a script of the differences without building the dacpac, so this makes me think it must be possible to do it using SQLPackage.exe using one of the parameters.
Here's an example of SQLPackage.exe which I'd like to adapt to use the .sql files instead of the dacpac:
sqlpackage.exe /Action:Script
  /SourceFile:"E:\SourceControl\Project\Database\test_SSDTProject\bin\Debug\test_SSDTProject.dacpac"
  /TargetConnectionString:"Data Source=local;Initial Catalog=TestDB;User ID=abc;Password=abc"
  /OutputPath:"C:\temp\hbupdate.sql" /OverwriteFiles:true
  /p:IgnoreExtendedProperties=True /p:IgnorePermissions=True
  /p:IgnoreRoleMembership=True /p:DropObjectsNotInSource=True
This works fine because it uses the dacpac file. However I need to point it at the folder structure where the .sql files are instead.
Any help would be great.
As has been suggested in comments, I think that biting the bullet and fixing the errors is the way ahead. You say
it's not practical or safe to fix them all,
but I think you should give this a bit more thought. I have recently been in a similar situation to you, and the key to emerging from it is to realise that the operational risk associated with dropping procedures and functions that will throw an exception as soon as they are called is zero.
Note that this does not apply if the reason these objects won't build is that they contain cross-database or cross-server references that are present in production but not in your project; this is a separate problem altogether, but also a solvable one.
Nor am I in favour of "exclude from build" as an alternative to "delete"; a while ago I saw a project where this technique had been deployed extensively; it makes it harder to see what does what from the source files and I am now of the opinion that "Build Action=None" is simply "commenting out the bits that don't work" for the Snapchat generation.
The key to all of this, of course, is source control. This addresses the residual risk that one day you might indeed want to implement a working version of one of your currently non-working procedures, using the non-working code as a starting point. It also obviates the need to keep stuff hanging around in the solution using Build Action=None, as one can simply summon an earlier revision of the code that contained the offending objects.
If my experience is any guide, 60 build errors is nothing; these could easily be caused by references to three or four objects that no longer exist, and can be consigned to the dustbin of source control with some enthusiastic use of the "Delete" key.
Do you have a copy of SQL Compare at your disposal? If not, it might be worth downloading the trial to see if it will work in your scenario.
Here are the available switches:
http://documentation.red-gate.com/display/SC10/Switches+used+in+the+command+line
At the very least you'll need to specify the following:
/scripts1:
/server2:
/database2:
/ScriptFile:
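Assembled into a command, that might look something like the following (the paths, server, and database values are placeholders; check the switch documentation above for your SQL Compare version):
sqlcompare.exe /scripts1:"E:\SourceControl\Project\Database\test_SSDTProject"
  /server2:localhost /database2:TestDB
  /ScriptFile:"C:\temp\hbupdate.sql"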

Batch rename with MSBuild

I just joined a team that has no CI process in place (not even an overnight build) and some sketchy development practices. There's desire to change that, so I've now been tasked with creating an overnight build. I've followed along with this series of articles to: create a master solution that contains all our projects (some web apps, a web service, some Windows services, and a couple of tools that compile to command line executables); create an MSBuild script to automatically build, package, and deploy our products; and create a .cmd file to do it all in one click. Here's a task that I'm trying to accomplish now as part of all this:
The team currently has a practice of keeping the web.config and app.config files outside of source control, and to put into source control files called web.template.config and app.template.config. The intention is that the developer will copy the .template.config file to .config in order to get all of the standard configuration values, and then be able to edit the values in the .config file to whatever he needs for local development/testing. For obvious reasons, I would like to automate the process of renaming the .template.config file to .config. What would be the best way to do this?
Is it possible to do this in the build script itself, without having to stipulate within the script every individual file that needs to be renamed (which would require maintenance to the script any time a new project is added to the solution)? Or might I have to write some batch file that I simply run from the script?
Furthermore, is there a better development solution that I can suggest that will make this entire process unnecessary?
After a lot of reading about Item Groups, Targets, and the Copy task, I've figured out how to do what I need.
<ItemGroup>
  <FilesToCopy Include="..\**\app.template.config">
    <NewFilename>app.config</NewFilename>
  </FilesToCopy>
  <FilesToCopy Include="..\**\web.template.config">
    <NewFilename>web.config</NewFilename>
  </FilesToCopy>
  <FilesToCopy Include="..\Hibernate\hibernate.cfg.template.xml">
    <NewFilename>hibernate.cfg.xml</NewFilename>
  </FilesToCopy>
</ItemGroup>
<Target Name="CopyFiles"
Inputs="#(FilesToCopy)"
Outputs="#(FilesToCopy->'%(RootDir)%(Directory)%(NewFilename)')">
<Message Text="Copying *.template.config files to *.config"/>
<Copy SourceFiles="#(FilesToCopy)"
DestinationFiles="#(FilesToCopy->'%(RootDir)%(Directory)%(NewFilename)')"/>
I create an item group that contains the files that I want to copy. The ** operator tells it to recurse through the entire directory tree to find every file with the specified name. I then add a piece of metadata to each of those files called "NewFilename". This is what I will be renaming each file to.
This snippet adds every file in the directory structure named app.template.config and specifies that I will be naming the new file app.config:
<FilesToCopy Include="..\**\app.template.config">
  <NewFilename>app.config</NewFilename>
</FilesToCopy>
I then create a target to copy all of the files. This target was initially very simple, only calling the Copy task in order to always copy and overwrite the files. I pass the FilesToCopy item group as the source of the copy operation. I use transforms in order to specify the output filenames, as well as my NewFilename metadata and the well-known item metadata.
The following snippet will e.g. transform the file c:\Project\Subdir\app.template.config to c:\Project\Subdir\app.config and copy the former to the latter:
<Target Name="CopyFiles">
<Copy SourceFiles="#(FilesToCopy)"
DestinationFiles="#(FilesToCopy->'%(RootDir)%(Directory)%(NewFileName)')"/>
</Target>
But then I noticed that a developer might not appreciate having his customized web.config file overwritten every time the script is run. However, the developer probably should get his local file overwritten if the repository's web.template.config has been modified and now has new values in it that the code needs. I tried doing this a number of different ways (setting the Copy task's "SkipUnchangedFiles" attribute to true, using the "Exists()" function), to no avail.
The solution to this was building incrementally. This ensures that files will only be over-written if the app.template.config is newer. I pass the names of the files as the target input, and I specify the new file names as the target output:
<Target Name="CopyFiles"
Input="#(FilesToCopy)"
Output="#(FilesToCopy->'%(RootDir)%(Directory)%(NewFileName)')">
...
</Target>
This has the target check to see if the current output is up-to-date with respect to the input. If it isn't, i.e. the particular .template.config file has more recent changes than its corresponding .config file, then it will copy the web.template.config over the existing web.config. Otherwise, it will leave the developer's web.config file alone and unmodified. If none of the specified files needs to be copied, then the target is skipped altogether. Immediately after a clean repository clone, every file will be copied.
The above turned out to be a satisfying solution, as I've only started using MSBuild and I'm surprised by its powerful capabilities. The only thing I don't like about it is that I had to repeat the exact same transform in two places. I hate duplicating any kind of code, but I couldn't figure out how to avoid this. If anyone has a tip, it'd be greatly appreciated. Also, while I think the development practice that necessitates this totally sucks, this does help in mitigating that suck factor.
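One possible (untested) way to reduce that duplication is to compute the destination path once as item metadata and then reference only that metadata in the target; the metadata name "Destination" here is made up:
<ItemGroup>
  <FilesToCopy Include="..\**\app.template.config">
    <Destination>%(RootDir)%(Directory)app.config</Destination>
  </FilesToCopy>
</ItemGroup>
<Target Name="CopyFiles"
        Inputs="@(FilesToCopy)"
        Outputs="@(FilesToCopy->'%(Destination)')">
  <Copy SourceFiles="@(FilesToCopy)"
        DestinationFiles="@(FilesToCopy->'%(Destination)')" />
</Target>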
Short answer:
Yes, you can (and should) automate this. You should be able to use MSBuild Move task to rename files.
Long answer:
It is great that there is a desire to change from a manual process to an automatic one. There are usually very few real reasons not to automate. Your build script will act as living documentation of how build and deployment actually works. In my humble opinion, a good build script is worth a lot more than static documentation (although I am not saying you should not have documentation - they are not mutually exclusive after all). Let's address your questions individually.
What would be the best way to do this?
I don't have a full understanding of what configuration you are storing in those files, but I suspect a lot of that configuration can be shared across the development team.
I would suggest raising the following questions:
Which of the settings are developer-specific?
Is there any way to standardise local developer machines so that settings could be shared?
Is it possible to do this in the build script itself, without having to stipulate within the script every individual file that needs to be renamed?
Yes, have a look at MSBuild Move task. You should be able to use it to rename files.
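A minimal sketch of that Move usage, reusing the wildcard-plus-transform idea from the answer above (the item name is illustrative, and the Move task requires MSBuild 4.0 or later):
<ItemGroup>
  <TemplateConfigs Include="..\**\web.template.config" />
</ItemGroup>
<Target Name="RenameTemplates">
  <!-- renames each web.template.config to web.config in place -->
  <Move SourceFiles="@(TemplateConfigs)"
        DestinationFiles="@(TemplateConfigs->'%(RootDir)%(Directory)web.config')" />
</Target>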
...which would require maintenance to the script any time a new project is added to the solution?
This is inevitable - your build scripts must evolve together with your solution. Accept this as a fact and include in your estimates time to make changes to your build scripts.
Furthermore, is there a better development solution that I can suggest that will make this entire process unnecessary?
I am not aware of all the requirements, so it is hard to recommend something very specific. I can suggest this:
Create a shared build script for your solution
Automate manual tasks as much as possible (within reason)
If you are struggling to automate something - it could be an indicator of an area that needs to be rethought/redesigned
Make sure your team mates understand how the build works and are able to make changes to it themselves - don't "own" the build and become a bottleneck
Bear in mind that going from no build script to full automation is not an overnight process. Be patient and first focus on automating areas that are causing the most pain.
If I have misinterpreted any of your questions, please let me know and I will update the answer.

How to determine where, or if, a variable is used in an SSIS package

I've inherited a collection of largely undocumented SSIS packages. The entry point package (i.e. the one that forks off in a variety of directions to call other packages) defines a number of variables. I would like to know how these variables are being used, but there doesn't seem to be an equivalent of "right click/Find All References".
Is there a reliable way to determine where these variables are being used?
A hackish way would be to open the dtsx file in a text editor/xml viewer and search for the variable name.
If it's being used in expressions, it should show it and you can trace the xml tree back up until you find the object it's being used on.
You can use the BIDS Helper add-in, which gives you visual feedback on where variables are used in your package. That makes it very fast and easy to detect them. Besides that, it offers several other valuable features.
Check out: http://bidshelper.codeplex.com/

Process for reducing the size of an executable

I'm producing a hex file to run on an ARM processor which I want to keep below 32K. It's currently a lot larger than that and I wondered if someone might have some advice on what's the best approach to slim it down?
Here's what I've done so far
So I've run 'size' on it to determine how big the hex file is.
Then I ran 'size' again to see how big each of the object files that are linked to create the hex file is. It seems the majority of the size comes from external libraries.
Then I used 'readelf' to see which functions take up the most memory.
I searched through the code to see if I could eliminate calls to those functions.
Here's where I get stuck, there's some functions which I don't call directly (e.g. _vfprintf) and I can't find what calls it so I can remove the call (as I think I don't need it).
So what are the next steps?
Response to answers:
As far as I can see, there are functions being called which take up a lot of memory. I cannot, however, find what is calling them.
I want to omit those functions (if possible) but I can't find what's calling them! Could be called from any number of library functions I guess.
The linker is working as desired, I think, it only includes the relevant library files. How do you know if only the relevant functions are being included? Can you set a flag or something for that?
I'm using GCC
General list:
Make sure that you have the compiler and linker debug options disabled
Compile and link with all size options turned on (-Os in gcc)
Run strip on the executable
Generate a map file and check your function sizes. You can either get your linker to generate your map file (-M when using ld), or you can use objdump on the final executable (note that this will only work on an unstripped executable!). This won't actually fix the problem, but it will let you know of the worst offenders.
Use nm to investigate the symbols that are called from each of your object files (see the sketch after this list). This should help in finding who's calling functions that you don't want called.
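As a concrete illustration of the map-file and nm steps (a sketch assuming a GNU toolchain; use the cross-prefixed tools, e.g. arm-none-eabi-nm, when cross-compiling, and treat the file names as placeholders):
# have the linker emit a map file alongside the executable
gcc -Os -Wl,-Map=firmware.map -o firmware.elf *.o
# list the largest symbols (requires an unstripped executable)
nm --print-size --size-sort firmware.elf | tail -n 20
# list undefined symbols per object file, to see which module pulls in what
nm -u some_module.o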
The original question included a sub-question about including only relevant functions. gcc will include all functions within every object file that is used. To put that another way, if you have an object file that contains 10 functions, all 10 functions are included in your executable even if only 1 is actually called.
The standard libraries (e.g. libc) will split functions into many separate object files, which are then archived. The executable is then linked against the archive.
By splitting into many object files the linker is able to include only the functions that are actually called. (This assumes that you're statically linking.)
There is no reason why you can't do the same trick. Of course, you could argue that if the functions aren't called then you can probably remove them yourself.
If you're statically linking against other libraries you can run the tools listed above over them too to make sure that they're following similar rules.
Another optimization that might save you work is -ffunction-sections, -Wl,--gc-sections, assuming you're using GCC. A good toolchain will not need to be told that, though.
Explanation: GNU ld links sections, and GCC emits one section per translation unit unless you tell it otherwise. But in C++, the nodes in the dependency graph are objects and functions.
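For example (a sketch; the file names are placeholders), the flags are applied at compile time and link time respectively:
# compile: place each function and data item in its own section
gcc -Os -ffunction-sections -fdata-sections -c foo.c -o foo.o
# link: let the linker discard sections that are never referenced
gcc -Wl,--gc-sections foo.o -o foo.elf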
On deeply embedded projects I always try to avoid using any standard library functions. Even simple functions like "strtol()" blow up the binary size. If possible just simply avoid those calls.
In most deeply embedded projects you don't need a versatile "printf()" or dynamic memory allocation (many controllers have 32kb or less RAM).
Instead of just using "printf()" I use a very simple custom "printf()"; this function can only print numbers in hexadecimal or decimal format, nothing more. Most data structures are preallocated at compile time.
Andrew EdgeCombe has a great list, but if you really want to scrape every last byte, sstrip is a good tool that is missing from the list and can shave off a few more kB.
For example, when run on strip itself, it can shave off ~2kB.
From an old README (see the comments at the top of this indirect source file):
sstrip is a small utility that removes the contents at the end of an
ELF file that are not part of the program's memory image.
Most ELF executables are built with both a program header table and a
section header table. However, only the former is required in order
for the OS to load, link and execute a program. sstrip attempts to
extract the ELF header, the program header table, and its contents,
leaving everything else in the bit bucket. It can only remove parts of
the file that occur at the end, after the parts to be saved. However,
this almost always includes the section header table, and occasionally
a few random sections that are not used when running a program.
Note that due to some of the information that it removes, an sstrip'd executable is rumoured to have issues with some tools. This is discussed more in the comments of the source.
Also... for an entertaining/crazy read on how to make the smallest possible executable, this article is worth a read.
Just to double-check and document for future reference: do you use Thumb instructions? They're 16-bit versions of the normal instructions. Sometimes you might need two 16-bit instructions in place of one, so it won't save 50% in code space.
A decent linker should take just the functions needed. However, you might need compiler and linker settings to package functions for individual linking.
OK, so in the end I just reduced the project to its simplest form, then slowly added files one by one until the function that I wanted to remove appeared in the 'readelf' output. Then, when I had the file, I commented everything out and slowly added things back in until the function popped up again. So in the end I found out what called it and removed all those calls... Now it works as desired... sweet!
Must be a better way to do it though.
To answer this specific need:
• "I want to omit those functions (if possible) but I can't find what's calling them! Could be called from any number of library functions I guess."
If you want to analyze your code base to see who calls what, by whom a given function is being called and things like that, there is a great tool out there called "Understand C" provided by SciTools.
https://scitools.com/
I have used it very often in the past to perform static code analysis. It can really help to determine the library dependency tree. Among other things, it allows you to easily browse up and down the call tree.
They provide a limited time evaluation, then you must purchase a license.
You could look at something like executable compression.