Merging Xcode project files - objective-c

There are often conflicts in the Xcode project file (Project.xcodeproj/project.pbxproj) when merging branches (I'm using git). Sometimes it's easy, but at times I end up with a corrupt project file and have to revert. In the worst case I have to fix up the project file manually in a second commit (which can be squashed with the previous) by dragging in files etc.
Does anyone have tips for how to handle merge conflicts in big and complex files like the Xcode project file?
EDIT-- Some related questions:
Git and pbxproj
Should I merge .pbxproj files with git using merge=union?
RESOURCES:
http://www.alphaworks.ibm.com/tech/xmldiffmerge
http://www2.informatik.hu-berlin.de/~obecker/XSLT/#merge
http://tdm.berlios.de/3dm/doc/thesis.pdf
http://www.cs.hut.fi/~ctl/3dm/
http://el4j.svn.sourceforge.net/viewvc/el4j/trunk/el4j/framework/modules/xml_merge/

Break your projects up into smaller, more logical libraries/packages. Massive projects are regularly the sign of a bad design, like the object that does way too much or is way too large.
Design for easy rebuilding -- this also helps if you're writing programs which must be built by multiple tools or IDEs. Many of my 'projects' can be reconstructed by adding one directory.
Remove extraneous build phases. Example: I've removed the "Copy Headers" build phase from all projects. Explicitly include the specific files via the include directive.
Use xcconfig files wherever possible. This also reduces the number of changes you must make when updating your builds. xcconfig files define a collection of build settings, and support #include. Of course, you then delete the (majority of) user defined settings from each project and target when you define the xcconfig to use.
For target dependencies: create targets which perform logical operations, rather than physical operations. This is usually a shell script target or aggregate target. For example: "build dependencies", "run all unit tests", "build all", "clean all". Then you do not have to maintain every dependency change every step of the way - it's like using references.
Define a common "Source Tree" for your code, and a second for 3rd party sources.
There are external build tools available. This may be an option for you (at least, for some of your targets).
At this point, an xcodeproj will be much simpler. It will require fewer changes, and be very easy to reconstruct. You can go much further with these concepts to further reduce the complexity of your projects and builds.

You might want to try https://github.com/simonwagner/mergepbx/
It is a script that will help you to merge Xcode project files correctly. Note that it is still alpha.
Disclaimer: I am the author of mergepbx.

The best way I have found is to instruct Git to treat the .pbxproj file as a binary. This prevents messy merges.
Add this to your .gitattributes file:
*.pbxproj -crlf -diff -merge
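Whichever merge strategy you settle on, a quick check before committing can catch a project file that still contains leftover conflict markers, which is one common way to end up with a corrupt project. The following is only a minimal sketch: the function name is made up for illustration, and it looks only for the three standard Git markers at the start of a line (not the extra marker used by the diff3 conflict style).

```python
import re

# Git writes these markers at the start of a line when a merge conflicts:
# "<<<<<<< ours", "=======", ">>>>>>> theirs".
_MARKER = re.compile(r"^(<{7}|={7}|>{7})( |$)")


def conflict_marker_lines(text):
    """Return the 1-based numbers of lines that begin with a conflict marker.

    Hypothetical helper: pass it the contents of project.pbxproj.
    """
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if _MARKER.match(line)
    ]
```

An empty result means no markers were found; it does not prove the project file is otherwise valid.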

To compare two Xcode projects, open FileMerge (open Xcode and select Xcode (from the menu bar) --> Open Developer Tool --> FileMerge).
Now click the "left" button and open the main directory of the first Xcode project.
Click the "right" button and open the main directory of the Xcode project to compare against.
Now click the "merge" button!
That's it!

Here is another option to consider, which may help to reduce the number of times you experience the problem. To explain, I'll call the branch that team members' branches come from the "develop" branch.
Have a convention in your team that when the project file is modified, the changes (along with any other changes required to ensure the build integrity) are committed in a separate commit. That commit is then cherry picked onto the develop branch. Other team members who plan to modify the project file in their branch can then either cherry pick into their branch or rebase their branch on the latest develop. This approach requires communication across the team and some discipline. As I said, it won't always be possible; on some projects it might help a lot and on some projects it might not.

Related

Source control in SSIS and Concurrent work on dtsx file

I am working on building a new SSIS project from scratch, and I want to work on it with a couple of my teammates. I was hoping to get a suggestion on how we can have some source control, so that a few of us can work concurrently on the same SSIS project (same dtsx file, building new packages).
Version:
SQL Server Integration Service v11
Microsoft Visual Studio 2010
It is my experience that there are two opportunities for any source control system and SSIS projects to get out of whack: adding new items to the project and concurrent changes to an existing package.
Adding new items
An SSIS project has the .dtproj extension. Inside there, it's "just" XML defining what all belongs to the project. At least for 2005/2008 and 2012+ on the package deployment model. The 2012+ project deployment model carries a good bit more information about the state of the packages in the project.
When you add new packages (or project level connection managers or .biml files) the internal structure of the .dtproj file is going to change. Diff tools generally don't handle merging XML well. Or at all really. So, to prevent the need for merging the project definition, you need to find a strategy that works for your team.
I've seen two approaches work well. The first is to upfront define all the packages you think you'll need. DimFoo, DimDate, DimFoo, DimBar, FactBlee. Check that project and the associated empty packages in and everyone works on what is out there. When the initial cut of packages is complete, then you'll ensure everyone is sync'ed up and then add more empty packages to the project. The idea here is that there is one person, usually the lead, who is responsible for changing the "master" project definition and everyone consumes from their change.
The other approach requires communication between team members. If you discover a package needs to be added, communicate with your mates "I need to add a new package - has anyone modified the project?" The answer should be No. Once you've notified that a change to the project definition is coming, make it and immediately commit it. The idea here is that people commit and sync (or check in, whatever the terminology) with great frequency. If you as a developer don't keep your local repository up to date, you're going to be in for a bad time.
Concurrent edits
Don't. Really, that's about it. The general problem with concurrent changes to an SSIS package is that, in addition to the XML diff issue above, SSIS also stores layout data alongside tasks. I can invert the layout and make things flow from bottom to top or right to left, and there's no material change to the SSIS package even though the file itself changes. As Siyual notes, "Merging changes in SSIS is nightmare fuel."
If you find your packages are so large and that developers need to make concurrent edits, I would propose that you are doing too much in there. Decompose your packages into smaller, more tightly focused units of work and then control their execution through a parent package. That would allow a better level of granularity to your development and debugging process in addition to avoiding the concurrent edit issue.
A dtsx file is basically just an XML file. Compare it to a bunch of people trying to write the same book. The solution I suggest is to use Team Foundation Server as source control. That way everyone can check in and out and merge packages. If you really don't have that option, try to split your ETL process into logical parts and at the end create a master package that calls each sub-package in the right order.
An example: Let's say you need to import stock data from one source, branches and other company information from an internal server, and sale amounts from different external sources. After you have all the information gathered, you want to connect those and run some analyses.
You first design the target database entities that you need and their relations. One member of your team creates a package that does all the imports to staging tables. Another maybe handles the external sources and parallelizes / optimizes the loading. You would build a package that merges your staging and production tables, maybe historicizing and so on.
At the end you have a master package that calls each of the mentioned packages and maybe some additional logging or such.
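The master-package idea can be pictured with a small analogy in Python. This is not SSIS code and not how SSIS executes packages; it is only a sketch of the control flow, where each hypothetical step stands in for one child package and the runner plays the role of the master package with its logging.

```python
import logging


def run_master(steps, log=None):
    """Run (name, callable) steps in order and return the names that completed.

    Each callable stands in for one child package. An exception raised by a
    step propagates, so later steps do not run after a failure.
    """
    log = log or logging.getLogger("master")
    completed = []
    for name, step in steps:
        log.info("starting %s", name)
        step()
        completed.append(name)
        log.info("finished %s", name)
    return completed
```

Because each step is developed and tested on its own, two people rarely need to edit the same unit at the same time, which is the point of splitting the work this way.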
In our multi-developer operation, we follow this rough plan:
Each dev has their own branch, separate from master branch
Once a week, devs push all their changes to remote
One of us pulls all changes, and merges all branches into master, manually resolving .dtproj conflicts as we go
Merge master into all dev branches - now all branches agree
Test in VS
Push all branches to remote, other devs can now pull and keep working
It's not a perfect solution, but it helps quarantine the amount of merge pain we have to experience.
We have large SSIS solutions with 20+ packages in one solution, with TFS Git. One project required adding a bunch of new packages to the existing solution. We thought we were smart and knew to assign only one person to work on each new package; 2 people working on the same package would be suicide. It wasn't good enough. When 2 people tried to add differently named new packages at the same time, each showed dtproj as a file that had changed and needed to be checked in, and suddenly I found myself looking at the XML for dtproj and trying to figure out which lines to keep (Microsoft should never ask end users to manually edit their internal files, which only they wrote and understand). Billinkc's solutions here are very good and the problem is very real. You may think that Microsoft is the great Wise One, and that your team can always add new packages to an existing solution without conflicts, but you'd be wrong. It also doesn't work to put dtproj in .gitignore. If you do that, you won't see other people's new packages (actually the .dtsx file will come down in git, but you won't see that package in Solution Explorer because dtproj is what feeds Solution Explorer). This is a current problem (2021) and we are using Visual Studio 2017 Enterprise with SSDT.
To explain this problem to people: git obviously can handle a group of independent, individual files in a directory (like say .bat files) and can add, change, and delete those files easily. The problem comes in when you have a file that is naming, describing, and counting all the files in a directory (which is what dtproj does). When you have a file like dtproj, you create a conflict on dtproj itself when 2 people try to add a new package at the same time. Your dtproj file has a line that shows the package you added, and my dtproj file shows the package I added, and TFS/Git sees that as a conflict.
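To make the distinction concrete, here is a conceptual sketch in Python of what a merge would do if it understood the project file as a list of entries rather than as lines of text. It is an illustration only: a real .dtproj is XML with more structure than a flat list, the function name is invented, and neither Git nor TFS works this way out of the box.

```python
def merge_entries(base, ours, theirs):
    """Three-way merge of ordered lists of unique entries.

    An entry from the common ancestor survives only if neither side removed
    it; entries newly added by either side are appended, ours first.
    """
    base_set = set(base)
    removed = (base_set - set(ours)) | (base_set - set(theirs))
    merged = [entry for entry in base if entry not in removed]
    for side in (ours, theirs):
        for entry in side:
            if entry not in base_set and entry not in merged:
                merged.append(entry)
    return merged


# Two developers each add one package to the same starting project.
base = ["DimDate.dtsx", "FactBlee.dtsx"]
mine = base + ["DimFoo.dtsx"]
yours = base + ["DimBar.dtsx"]
print(merge_entries(base, mine, yours))
```

A line-based merge sees both additions landing at the same spot in the file and reports a conflict, whereas at the level of entries the two changes are independent and both packages are kept.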
Some are suggesting ways to deal with this if you have to add a lot of new packages, my idea is a little different. For the people who have to add new packages, don't work in the primary solution where this problem is, work somewhere else. Probably best to work in the "Projects" directory you get when you install Visual Studio, outside of TFS/Git. Obviously follow all the standards, Variable naming, and Package Configuration conventions for the target Solution. Then when the new packages are ready, give the .dtsx files to your Solution Gatekeeper for them to check in. Only the Gatekeeper can check in new packages using Add From Existing, avoiding conflicts. Once the package is checked in, developers can work on them in the main Solution.

IntelliJ: generate a JAR but do *NOT* include dependencies

In a simple IntelliJ module, I just want to generate a .jar file with my .class files, via IntelliJ IDE commands.
Please be careful before marking this as a "duplicate":
Although I've seen Google and Stack hits with promising titles, I'm not finding a really good answer, or the title is misleading, or it's an unanswered question. I cover one possible answer that I've seen before (below), and why I don't think it's a match.
I've used Eclipse in the past, but I'm rather new to IntelliJ.
I've worked with the "Project Structure / Artifacts" stuff. I can generate the giant jar, similar to using "shade", but it's huge because it includes all the nested dependencies. We want the small jar with just this module's class files because the system we're deploying to already has all the other jars in place.
I've seen some references to changing a target directory in the Artifacts dialog box, but it then talks about references being made in the Manifest file, which I don't want. The destination environment already has its java paths setup, so I'm worried that having jar references in this jar will mess that up. If this really is the answer then I'm confused about how it works.
Constraint 1: Can't use command line tools, since I'm actually walking somebody else through these steps, who likely doesn't have command line tools installed in the path, or wouldn't know how to use them, etc. They're not a coder. (Yes, I know this sounds like an odd scenario; I inherited this situation.)
Constraint 2: We want to keep this as a simple IntelliJ project, vs. converting to Maven or Ant or Gradle, etc.
Coworker had the fix.
Short Answer:
Remove all of the other jars/libraries from Output Layout tab of the Artifacts config dialog.
Longer Answer:
You still do File / Project Structure...
Then in the Project Settings, click Artifacts.
And then you still click the plus button (second column) to create a new artifact setting.
The trick is the "Output Layout" tab in the third column of the window. Highlight all entries EXCEPT the compiled output of your project and delete all those other entries (click the minus button under that tab, directly above your_project.jar)
On my laptop this causes it to pause for a few seconds; I thought it didn't do anything, then finally it reflected that everything was gone except "'my_module' compile output"
Also check the "Build on make" (for when you later do Build / Rebuild Project)
If you need both a full jar and a slim jar, you can have more than one Artifact configuration with different names, and they will default to different output directories.
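If whoever sets up the artifact wants to confirm that the slim jar really contains only the module's own classes, a jar can be inspected as an ordinary zip archive. The sketch below is an optional check for the person preparing the instructions, not a step for the non-coder; the function name and the package prefix in the usage note are hypothetical.

```python
import zipfile


def non_module_entries(jar, module_prefix):
    """Return .class entries in the jar that lie outside module_prefix.

    `jar` is a path or a binary file object; `module_prefix` is the module's
    package path using forward slashes, ending in a slash.
    """
    with zipfile.ZipFile(jar) as archive:
        return sorted(
            name
            for name in archive.namelist()
            if name.endswith(".class") and not name.startswith(module_prefix)
        )
```

For example, `non_module_entries("my_module.jar", "com/example/app/")` returning an empty list would suggest that no dependency classes were bundled.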

Is it possible to add a whole directory of source files to CMake command add_executable?

The documentation of CMake's add_executable gives the following specification of the command:
add_executable(<name> [WIN32] [MACOSX_BUNDLE]
[EXCLUDE_FROM_ALL]
source1 [source2 ...])
I now have a rather large project with a lot of sources and was wondering if it is possible to add a directory as a parameter for add_executable instead of specifying each source file individually? If not, are there any best practices or recommendations on how to approach this situation? I can't imagine the only way this would work is by adding each source file individually? How would this work for (really) large projects then, this doesn't seem like an elegant approach...
The best practice is indeed to list all files manually.
In particular, the CMake docs warn about using GLOB for this purpose:
We do not recommend using GLOB to collect a list of source files from
your source tree. If no CMakeLists.txt file changes when a source is
added or removed then the generated build system cannot know when to
ask CMake to regenerate.
This point is somewhat controversial, as many developers prefer that the build system just adjusts automatically to newly added files. The price for this automation is an increase in fragility of the build scripts.
You will have to remember to manually re-run CMake whenever files were added or removed. You also have to ensure that the physical layout of the files on disk matches the logical layout of the projects that you want to build. The latter point is arguably the bigger problem here. By decoupling the build system from the files on disk you add an additional safety net, but you have to pay for it with increased build script maintenance costs.
The biggest disadvantage of the explicit approach is imho that if you forget to add a new file to the CMakeLists, you might be wondering over weird linker errors for a while before realizing your mistake. I personally find the maintenance overhead for this approach acceptable. Sure, you will have a lengthy filelist in your build script, but you do not have to touch it that often and the changes will usually be trivial.
Since this point is somewhat controversial, I won't blame you if you want to use a GLOB for your project. Just be aware of the consequences and be prepared that all the cool kids will laugh at you if your build breaks one day because of this.
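If you go with the explicit list, the "forgot to add the new file" problem can be softened with a small check outside of CMake. The following is a rough sketch under a few assumptions: the helper name and the set of suffixes are made up, the list of sources is passed in by hand rather than parsed from CMakeLists.txt, and the build directory is assumed to live outside the scanned source directory.

```python
from pathlib import Path

# Assumed set of extensions; adjust to the languages the project uses.
SOURCE_SUFFIXES = {".c", ".cc", ".cpp", ".cxx"}


def unlisted_sources(source_dir, listed):
    """Return source files under source_dir that are missing from `listed`.

    `listed` holds paths relative to source_dir with forward slashes, the way
    they would be written in an add_executable() call.
    """
    root = Path(source_dir)
    listed_set = {Path(p).as_posix() for p in listed}
    found = (
        p.relative_to(root).as_posix()
        for p in root.rglob("*")
        if p.is_file() and p.suffix in SOURCE_SUFFIXES
    )
    return sorted(f for f in found if f not in listed_set)
```

Run as part of a review or CI step, a non-empty result points at the missing file directly instead of leaving you to decode a linker error.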

In cmake, what is a "project"?

This question is about the project command and, by extension, what the concept of a project means in cmake. I genuinely don't understand what a project is, and how it differs from a target (which I do understand, I think).
I had a look at the cmake documentation for the project command, and it says that the project command does this:
Set a name, version, and enable languages for the entire project.
It should go without saying that using the word project to define project is less than helpful.
Nowhere on the page does it seem to explain what a project actually is (it goes through some of the things the command does, but doesn't say whether that list is exclusive or not). The cmake.org examples take us through a basic build setup, and while it uses the project keyword it also doesn't explain what it does or means, at least not as far as I can tell.
What is a project? And what does the project command do?
A project logically groups a number of targets (that is, libraries, executables and custom build steps) into a self-contained collection that can be built on its own.
In practice that means, if you have a project command in a CMakeLists.txt, you should be able to run CMake from that file and the generator should produce something that is buildable. In most codebases, you will only have a single project per build.
Note however that you may nest multiple projects. A top-level project may include a subdirectory which is in turn another self-contained project. In this case, the project command introduces additional scoping for certain values. For example, the PROJECT_BINARY_DIR variable will always point to the root binary directory of the current project. Compare this with CMAKE_BINARY_DIR, which always points to the binary directory of the top-level project. Also note that certain generators may generate additional files for projects. For example, the Visual Studio generators will create a .sln solution file for each subproject.
Use sub-projects if your codebase is very complex and you need users to be able to build certain components in isolation. This gives you a very powerful mechanism for structuring the build system. Due to the increased coding and maintenance overhead required to make the several sub-projects truly self-contained, I would advise to only go down that road if you have a real use case for it. Splitting the codebase into different targets should always be the preferred mechanism for structuring the build, while sub-projects should be reserved for those rare cases where you really need to make a subset of targets self-contained.

Dividing a project into multiple Xcode project files

An iPad project I have been working on has become bloated with a huge number of files. The application is a prototype and we are considering ways to prevent this when we rewrite it.
One of the members of our team suggests dividing all of the components into separate Xcode projects which will be included in a master Xcode project.
Is this a good idea? What are the reasons, if any, to avoid dividing features/components/controls into separate Xcode projects?
You can add a subsidiary project file to a master project file in Xcode. Just choose "Add File" and add it. When Xcode builds the master it will build the subsidiary as well if needed.
I use a similar system. I often break a project into sub projects just so I can focus on and enforce encapsulation. I write the data model first, then add the app delegate, then specific UI elements. I add each project to the next in turn. This also allows me to go back and change things without as much risk of breaking.
Really, a properly designed Objective-C app should be easy to decompose into multiple projects. Ideally, all the components are so encapsulated that they don't need any others save the data model.
We have put some of the code in its own project, building a framework which we link against in some of the other projects. It's sometimes annoying that you won't see the implementation files of the framework code right away in another project (by cmd+clicking or cmd+shift+D, or whatever you do normally to navigate). Xcode will only show you the header; you'll have to open the other project and find your file there manually. Not a big deal, but if you look up the code often, it will bother you.
A real problem is that you change the scope of some operations. Stuff like "Find in project" will work on a different file set, which might not be what you want sometimes (trying to find where this method is called / key is used in your whole code, or something); well, there remains Finder / find, so it might be okay. Refactoring is not - all the renaming stuff just breaks, as it will change only the code of the current project, but not of projects referencing this one. If you change interfaces often, better avoid splitting up the project.
A good thing is that you will get fewer conflicts on your .xcodeproj files (if stored in a shared repository), as someone removing a file from project X won't create a conflict with someone else adding a target to project Y, which were previously the same .xcodeproj (not exactly sure this is a conflict case, but there definitely are some).
Now with Xcode4 you can create a workspace and add all your projects there. Only for documentation purpose :)
To view and modify subproject implementation files, you should add the subprojects directly into the main project.
Step 1 - Drag and drop the .xcodeproj files into the main project.
Step 2 - Go to main project TARGETS -> Build Phases. Add the subproject target in Target Dependencies. You can also add binary files in Link Binary With Libraries.
Step 3 - Add the subproject source path to the main project's header search path.
Go to main project -> Build Settings -> Header Search Paths (e.g. $(SRCROOT)/../CoconutKit-master/CoconutKit/Sources)
An Xcode project can have any number of build targets within it, and you can arbitrarily group source files into folders. What makes you think that multiple projects are necessary?