UNBIASED FACTS:

* Bugzilla & R-devel Mailing Lists: Remain unchanged. They would serve as the ticketing platforms for bug fixes submitted as pull requests to the R-devel Git repository.
* Git Repository Options:
  A) GitHub (cloud, with automated backups from GitHub to the CRAN server): https://github.com
  B) GitLab (self-hosted on CRAN): https://about.gitlab.com
  C) Phabricator (self-hosted on CRAN): https://www.phacility.com
  D) Microsoft CodePlex: https://www.codeplex.com
  E) Others: Unknown

GOOGLE TRENDS: https://trends.google.com/trends/explore?date=all&q=Git,Svn,Github,Gitlab

EXAMPLE Git Repository on Core Python: https://github.com/python

PERSONAL OPINION / MOTIVATION:

I think that moving in this direction is important because it would allow true Open Source Innovation & Open Collaboration in R between:
* The R Community.
* And R-Core.

For:
* R Bug Fixes.
* And the Core Feature Wishlist.

As anyone, not only R-Core, would be able to:
* Check the unassigned bugs in Bugzilla.
* And propose bug fixes themselves as pull requests (mentioning the Bug ID from Bugzilla or the mailing lists).

This would allow _individuals_ from universities or companies interested in the development of R:
* Apart from donating financial resources to the R Foundation,
* To help maintain the core R code themselves.

This aligns with the true spirit of R, which should be developed by contributing individuals, for individuals themselves.

It would also put the focus on the precise lines of code changed with each commit, and make it easy to revert changes, without verbose e-mails: Tidy, Clean, Maintainable, and Fast.

Finally, I noticed that the R-devel Archives do not have an e-mail ID (unique unsigned integer), so it would be a good idea to add one for pull requests if Git were adopted.

Juan
Community Feedback: Git Repository for R-Devel
5 messages · Juan Telleria, Mark van der Loo
This question has been discussed before on this list:
http://r.789695.n4.nabble.com/Why-R-project-source-code-is-not-on-Github-td4695779.html

See especially Jeroen's answer.

Best,
Mark

(In reply to Juan Telleria <jtelleriar at gmail.com>, Thu 4 Jan 2018, 01:11.)
______________________________________________ R-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Thank you Mark, this is what I was looking for.

On Sunday I will reread the facts from the previous discussion in detail and post the pros and cons here, so that they are preserved for the future and the topic can be closed.

Juan

(In reply to Mark van der Loo <mark.vanderloo at gmail.com>, 4 Jan 2018, 11:06 a.m.)
1 day later
I attach a basic State of the Art:

########################################################################
# State of the Art Analysis of Git vs SVN
########################################################################

Scopus Keywords: GIT AND SVN

########################################################################
# 1. How Do Centralized (SVN) and Distributed Version Control (GIT) Systems Impact Software Changes? (22 Citations; Published: 2014)
########################################################################

1.1 Paper Conclusions

We found that the use of CVCS and DVCS have observable effects on developers, teams and processes. The most surprising findings are that (i) the size of commits in DVCS was smaller than in CVCS, (ii) developers split commits (group changes by intent) more often in DVCS, and (iii) DVCS commits are more likely to reference issue tracking labels. These show that DVCS contain higher quality commits compared to CVCS due to their smaller size, cohesive changes and the presence of issue tracking labels.

The survey provided valuable information on why developers prefer one paradigm versus the other. DVCS are preferred because of killer features, such as the ability of committing locally. In contrast CVCS are preferred for their ease of use and faster learning curve.

1.2 Full Paper

http://dig.cs.illinois.edu/papers/ICSE14_Caius.pdf

########################################################################
# 2. Version Control with Git (Book: J Loeliger, M McCullough, 2012)
########################################################################

2.1 Book Introduction

***The Birth of Git***

Often, when there is discord between a tool and a project, the developers simply create a new tool. Indeed, in the world of software, the temptation to create new tools can be deceptively easy and inviting. In the face of many existing version control systems, the decision to create another shouldn't be made casually. However, given a critical need, a bit of insight, and a healthy dose of motivation, forging a new tool can be exactly the right course.

Git, affectionately termed "the information manager from hell" by its creator is such a tool. Although the precise circumstances and timing of its genesis are shrouded in political wrangling within the Linux Kernel community, there is no doubt that what came from that fire is a well-engineered version control system capable of supporting worldwide development of software on a large scale.

Prior to Git, the Linux Kernel was developed using the commercial BitKeeper VCS, which provided sophisticated operations not available in then-current, free software version control systems such as RCS and CVS. However, when the company that owned BitKeeper placed additional restrictions on its "free as in beer" version in the spring of 2005, the Linux community realized that BitKeeper was no longer a viable solution.

Linus looked for alternatives. Eschewing commercial solutions, he studied the free software packages but found the same limitations and flaws that led him to reject them previously. What was wrong with the existing VCS systems? What were the elusive missing features or characteristics that Linus wanted and couldn't find?

***Facilitate distributed development***

There are many facets to "distributed development," and Linus wanted a new VCS that would cover most of them.
It had to allow parallel as well as independent and simultaneous development in private repositories without the need for constant synchronization with a central repository, which could form a development bottleneck. It had to allow multiple developers in multiple locations even if some of them were offline temporarily.

***Scale to handle thousands of developers***

It isn't enough just to have a distributed development model. Linus knew that thousands of developers contribute to each Linux release, so any new VCS had to handle a very large number of developers, whether they were working on the same or on different parts of a common project. And the new VCS had to be able to integrate all of their work reliably.

***Perform quickly and efficiently***

Linus was determined to ensure that a new VCS was fast and efficient. In order to support the sheer volume of update operations that would be made on the Linux Kernel alone, he knew that both individual update operations and network transfer operations would have to be very fast. To save space and thus transfer time, compression and "delta" techniques would be needed. Using a distributed model instead of a centralized model also ensured that network latency would not hinder daily development.

***Maintain integrity and trust***

Because Git is a distributed revision control system, it is vital to obtain absolute assurance that data integrity is maintained and is not somehow being altered. How do you know the data hasn't been altered in transition from one developer to the next, or from one repository to the next? For that matter, how do you know that the data in a Git repository is even what it purports to be?

Git uses a common cryptographic hash function, called Secure Hash Function (SHA1), to name and identify objects within its database. Although perhaps not absolute, in practice it has proven to be solid enough to ensure integrity and trust for all of Git's distributed repositories.
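[Illustration, not part of the book excerpt] The naming of objects by their SHA1 hash can be sketched in a few lines of Python. This is a minimal sketch, assuming the classic SHA-1 object format for file contents ("blobs"): the ID is the SHA-1 of a short header, `blob <size>` followed by a NUL byte, and then the raw content. The function name `git_blob_id` is made up for this example.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object ID Git would assign to a blob with this content.

    Assumes the classic SHA-1 object format: "blob <size>\\0" + content.
    """
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


if __name__ == "__main__":
    # The empty blob has the well-known ID e69de29bb2d1d6434b8b29ae775ad8c2e48c5391.
    print(git_blob_id(b""))
    # Identical content always gets the same ID; any change gives a different one.
    print(git_blob_id(b"hello world\n"))
    print(git_blob_id(b"hello world!\n"))
```

Because the ID is derived only from the content, two repositories holding the same file agree on its name without talking to each other, and a changed byte shows up as a changed ID.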

***Enforce accountability***

One of the key aspects of a version control system is knowing who changed files, and if at all possible, why. Git enforces a change log on every commit that changes a file. The information stored in that change log is left up to the developer, project requirements, management, convention, etc. Git ensures that changes will not happen mysteriously to files under version control because there is an accountability trail for all changes.

***Immutability***

Git's repository database contains data objects that are immutable. That is, once they have been created and placed in the database, they cannot be modified. They can be recreated differently, of course, but the original data cannot be altered without consequences. The design of the Git database means that the entire history stored within the version control database is also immutable. Using immutable objects has several advantages, including very quick comparison for equality.

***Atomic transactions***

With atomic transactions, a number of different but related changes are performed either all together or not at all. This property ensures that the version control database is not left in a partially changed (and hence possibly corrupted) state while an update or commit is happening. Git implements atomic transactions by recording complete, discrete repository states that cannot be broken down into individual or smaller state changes.

***Support and encourage branched development***

Almost all VCSs can name different genealogies of development within a single project. For instance, one sequence of code changes could be called "development" while another is referred to as "test." Each version control system can also split a single line of development into multiple lines and then unify, or merge, the disparate threads. As with most VCSs, Git calls a line of development a branch and assigns each branch a name.

Along with branching comes merging.
Just as Linus wanted easy branching to foster alternate lines of development, he also wanted to facilitate easy merging of those branches. Because branch merging has often been a painful and difficult operation in version control systems, it would be essential to support clean, fast, easy merging.

***Complete repositories***

So that individual developers needn't query a centralized repository server for historical revision information, it was essential that each repository have a complete copy of all historical revisions of every file.

***A clean internal design***

Even though end users might not be concerned about a clean internal design, it was important to Linus and ultimately to other Git developers as well. Git's object model has simple structures that capture fundamental concepts for raw data, directory structure, recording changes, etc. Coupling the object model with a globally unique identifier technique allowed a very clean data model that could be managed in a distributed development environment.

***Be free, as in freedom***

'Nuff said.

Given a clean slate to create a new VCS, many talented software engineers collaborated and Git was born. Necessity was the mother of invention again!

2.2 Book Link

https://books.google.es/books?hl=en&lr=&id=aM7-Oxo3qdQC&oi=fnd&pg=PR3&dq=GIT+SVN&ots=39uhIKPlpc&sig=PmxABWMem-h4Fp1-JR-4C2HTwUY&redir_esc=y#v=onepage&q=GIT%20SVN&f=false

Chapter 18, "Using Git with Subversion Repositories", is of special interest.

You can find the full book with a basic Google search: "Version Control with Git" filetype:pdf

Juan
########################################################################
# R-devel Archives: "Why R-project source code is not on Github" [Summary: Aug 2014]
########################################################################

Key Citations (Pros and Cons) from R-devel Archives

########################################################################
# GIT PROS
########################################################################

1. [Simon Urbanek R-devel Aug 21, 2014] Github just makes it much easier to create and post patches to the project - it has nothing to do with write access - typically on Github the community has no write access, either.

2. [Simon Urbanek R-devel Aug 21, 2014] Using pull requests is certainly much less fragile than e-mails and patches are based on forked branches, so you can directly build the patched version if you want without manually applying the patch - and you see the whole history so you can pick out things logically.

3. [Simon Urbanek R-devel Aug 21, 2014] You can comment on individual patches to discuss them and even individual commits - often leading to a quick round trip time of revising it.

4. [Yihui Xie-2 R-devel Aug 22, 2014] Sometimes the patches are not worth emails back and forth, such as the correction of typos. I cannot think of anything else that is more efficient than being able to discuss the patch right in the lines of diff's.

5. [Gaurav Sehrawat R-devel Aug 24, 2014] Bridging gap between web2.0 and web1.0 development methodologies & thus passing code to younger generation.

6. [Jeroen Ooms. R-devel Aug 24, 2014] By now all activity of r-base [1] cran [2] and r-forge [3] is continuously mirrored on Github, which already gives unprecedented insight in developments. At least several r-core members [4,5,6,7,8] have been spotted on Github, and this years useR2014 website [9] was developed and hosted completely on Github. It seems like a matter of time until the benefits outweigh the cost of a migration, even to the more conservative stakeholders.

7. [Spencer Graves-2 R-devel Aug 24, 2014] We could use Git without Github (Gitlab, ...)

8. [Spencer Graves-2 R-devel Aug 24, 2014] It should be easy and cheap for someone to program a server to make daily backup copies of whatever we want from Github. This could provide an insurance policy in case events push the group to leave Github.

9. [Brian Rowe R-devel Aug 24, 2014] One thing to note about git vs svn is that each git repository is a complete repository containing the full history, so despite github acting as a central repository, it is not the same as a central svn repository. In svn the central repository is typically the only repository with a complete revision history, but that is not the case with git.

10. [Simon Urbanek R-devel Aug 25, 2014] There is no point in using git alone (Github actually supports direct SVN access as well).

11. [Simon Urbanek R-devel Aug 25, 2014] Github: The whole point are the collaborative features.

########################################################################
# GIT CONS
########################################################################

1. [Marc Schwartz R-devel Aug 21, 2014] Since the current SVN based system works well for them (R Core) and provides restricted write access that they can control, there is no motivation to move to an alternative version control system unless they would find it to be superior for their own development processes.

2. [Jeroen Ooms. R-devel Aug 24, 2014] These things take time

3. [Jeroen Ooms. R-devel Aug 24, 2014] However moving development of a medium sized, 20 year old open source project is not trivial. You are dealing with a large commit history and many contributors that all have to overhaul their familiar tools and development practices overnight.

4. [Jeroen Ooms. R-devel Aug 24, 2014] There is also the infrastructure of nightly builds and CRAN r-devel package checking that relies on the svn.

5. [Jeroen Ooms. R-devel Aug 24, 2014] Moreover moving to Github means changes in communications, for example replacing the current bug tracking system to Github "issues".

6. [Jeroen Ooms. R-devel Aug 24, 2014] In addition, several members are skeptical about putting source code in the hands of a for-profit US company, and other legal issues.

7. [Jeroen Ooms. R-devel Aug 24, 2014] The most critical piece of making such a transition is a detailed proposal outlining what the migration would involve, the cost/benefits, a planning, and someone that is willing to take the lead. Only on the basis of such a serious proposal you can have a discussion in which everyone can voice concerns, be assured that his/her interests are secure, and the idea can eventually be put up for a vote.

8. [Kirill Müller. Jan 05, 2018 (With Permission)] Migration Technical Problems:
   - Keeping monotonic revision numbers seems to be a requirement for migrating to GitHub.
   - It may be more difficult to apply patches produced by "git format-patch" or "git diff" (obtained from Winston's GitHub mirror) to an SVN working copy, because patches created by Git are missing the SVN base revision. (This is an obstacle to adopting GitHub gradually.)

* NOTE: Paper (Previous Transition Experience): http://iopscience.iop.org/article/10.1088/1742-6596/898/7/072024/pdf
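
On the missing SVN base revision mentioned in Cons item 8: mirrors created with git-svn normally record the SVN revision of each commit in a `git-svn-id:` trailer at the end of the commit message, in the form `git-svn-id: <URL>@<revision> <UUID>`. Whether the mirror in question keeps these trailers is an assumption here. Under that assumption, the base revision for a patch could be recovered from the message of the commit it was made against, roughly as follows. The function name, URL, revision number and UUID below are placeholders for illustration only.

```python
import re
from typing import Optional

# Trailer written by git-svn: "git-svn-id: <URL>@<revision> <UUID>"
GIT_SVN_ID = re.compile(r"^git-svn-id:\s+(\S+)@(\d+)\s+(\S+)\s*$", re.MULTILINE)


def svn_base_revision(commit_message: str) -> Optional[int]:
    """Return the SVN revision recorded in a git-svn commit message, or None."""
    match = GIT_SVN_ID.search(commit_message)
    return int(match.group(2)) if match else None


if __name__ == "__main__":
    # Hypothetical commit message; URL, revision and UUID are placeholders.
    message = (
        "Fix typo in documentation\n"
        "\n"
        "git-svn-id: https://svn.example.org/repo/trunk@12345 "
        "00000000-0000-0000-0000-000000000000\n"
    )
    print(svn_base_revision(message))
    print(svn_base_revision("A commit without any SVN metadata"))
```

This only recovers the revision number. Applying the patch to an SVN working copy would still require updating the working copy to that revision first.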