author Nicolas Vigier <boklm@mageia.org> 2011-01-11 00:35:59 +0000
committer Nicolas Vigier <boklm@mageia.org> 2011-01-11 00:35:59 +0000
commit ad7fb7807ceaee96521d779993a5e1b28650723f
tree 2ece42aa7e83b7fdb51702b298aa3eec95da3573 /BRANCH
parent 715e125cc8d0b3fc4a79752e28a8b76a4ce97d5a
rename repsys to mgarepo, RepSys to MgaRepo, and update docs and examples for Mageia
Diffstat (limited to 'BRANCH')
-rw-r--r-- BRANCH 419
1 files changed, 0 insertions, 419 deletions
diff --git a/BRANCH b/BRANCH
deleted file mode 100644
index ebb87c7..0000000
--- a/BRANCH
+++ /dev/null
@@ -1,419 +0,0 @@
-================================
-The detached binaries repository
-================================
-
-.. contents::
-
-A brief description
-===================
-
-Ideally, all binaries from package sources (i.e. all the binary files inside
-SOURCES/) will be placed in another subversion repository. This repository
-is called "tarballs repository", "binaries repository" or just "binrepo".
-It will contain mostly the same directory structure as the main repository,
-but instead of having SOURCES and SPECS, it will only have a SOURCES
-directory. Every copy/move operation should happen in both repositories.
-
-In order to allow retiring binaries from older distributions, each stable
-distro will have its own subversion repository for binary files. repsys
-knows how to access these binrepos by checking which URL defined in the
-"[binrepo]" section of the configuration file matches the path-part of the
-repository being accessed. (see open issues)
-
-The package changelogs will be generated from SVN commit logs in the main
-"plaintext" repository ("txtrepo" for short) only. Old changelogs will be
-preserved, as even empty revisions are preserved in the binaries-filtering
-conversion.
-
-
-Mapping repositories states
----------------------------
-
-In order to allow the use of `repsys {getsrpm,co} -r REV`, repsys will have
-to use a reference in the text repo which will be used to know what state
-the binrepo was in when a binary was uploaded.
-
-We cannot use direct revision number mapping through properties/files/etc.
-mainly because we may have multiple binaries repositories, and eventually
-they can be filtered to reduce space, so we can't ensure revisions will
-survive. Thus another mechanism, which relies on dates instead of revision
-numbers, is needed.
-
-When a binary is uploaded to the binrepo, the file `sha1.lst` is updated to
-have the file's hash and committed in the main text repo. This file will be
-used as the reference when the user uses -r REV on repsys. repsys will
-check out the package in the main text repo with -r REV and then will use
-the "Last Changed Date" of `sha1.lst` to check out the binrepo part. Thus,
-`sha1.lst` should always be committed to the main text repository *after* the
-corresponding binary files have been committed to the binrepo. Hooks in the
-main repository may be used to try to enforce this, by checking if the files
-changed in `sha1.lst` are already committed in the corresponding binrepo.
-
-Computation of `sha1.lst` is unlikely to be an issue:
-
-- it should not happen too often for any given package
-- it takes[0] less than 10s to sha1sum all SOURCES of openoffice.org-3.1-1mdv2010.0.src.rpm
-- it probably takes way less than the time to upload the file into the repository
-- it can be computed in parallel to the binrepo commit, and probably finish
- before that, thus ready by the time `sha1.lst` should be committed
-- users don't need to verify the SHA1s "everytime", but the build system
- does, thus Repsys can default to not verifying and avoid wasting users' time
-
-The use of `sha1.lst` has the valuable property of tying the state of the main
-repository and the binrepo. With it, at getsrpm time of a package
-submission we can verify the SHA1 of the SOURCES-bin, and be sure that
-either the package will be built with the expected state, or the build
-fails early. It also allows for verifying binaries without trusting the binrepo,
-which may be useful if we consider using an unversioned plain filesystem
-storage in the future (for old distros or whatever), or at "client side",
-which maintainers may find useful.
-
-[0]: On a single-core AMD Athlon(tm) 3800+ (2400 MHz)
-
-Mapping of revisions using SVN properties
------------------------------------------
-
-Alternatively to using the above "sha1.lst scheme", the revision mapping
-between the main repository and a binrepo could be done using subversion
-properties. This could be done by making every commit to binrepos also
-cause a corresponding commit in the main text repository to happen, which
-would update a property recording the current date. That is, a subversion
-property in the main text repository would be kept, such that for any given
-main repository revision, the corresponding state of the binrepos is
-obtainable (using the registered date).
-
-This would be "more transparent", as it can be maintained simply by using
-subversion hooks, without user intervention. OTOH, as every time the user
-commits to a binrepo this would result in a commit in the main repository,
-it would require the user to "svn up" the directories from there before
-committing, after every binrepo commit. Also, this might result in a big
-number of "bogus" commits to the main repository, which could be seen as log
-pollution, and may potentially increase space usage, etc.
-
-Why a new repository without the tarballs
-==========================================
-
-- the current svn repository is too large and hard to manage
-- big binary files (in general, "tarballs") history is of little value in
- the distro development, we care much more about our specs, patches,
- configurations, etc.; nonetheless, those big files we don't care much for
- take the most resources and make backups and restoration in case of
- failure very expensive, much more so than the more valuable data
-- there is no easy way to strip undesired tarballs without recreating the
- whole repository
-- Fedora and Ubuntu have separated repositories, so we must have it too!
-
-Numbers
--------
-
-Current repository is +390000 revisions and ~340GB big, while the bzip2ed
-dumps backup for it takes a bit more than half that size (FIXME:
-estimate, can't check in the backup server right now). Current txtrepo
-with the same number of revisions is ~180GB big, takes about 2-3 days to be
-imported, while the gzipped full dump backup for it currently takes ~1.2GB.
-Initial binrepo for Cooker (only `current/` packages' branches) took ~28GB
-on disk, gzipped full dump for it takes ~25GB, took about 5h30m to be
-populated from the currently used repository ("oldrepo").
-
-
-Drawbacks of this layout
-=========================
-
-- (always) everything that changes the single-repository usage increases the chance
- of failure and makes things more complicated.
-- subversion can't be used alone as easily as the current scheme allows
-- copying binaries between distro branches may not be "svn-cheap" anymore
- (unless they're in the same binrepo)
-- ...
-
-
-Open issues
-============
-
-Multiple binrepos don't allow us to have one permanent URL
-----------------------------------------------------------
-
-We would have to update the configuration files of all the users in order
-to add a new stable repository. spuk suggests using properties in the main
-text repo that would point to the right repository locations.
-
-How to handle failures when operating on more repositories?
------------------------------------------------------------
-
-binrepos should replicate the structure of the main text repo. What should
-we do if the markrelease succeeds in the binrepo, but fails in the main
-text repo?
-
-R: Markrelease must be done first in the txtrepo. If it fails there "we're
-in trouble" (though currently, we just miss it[0]). When the markrelease is
-done in the txtrepo, we can do markrelease in the binrepo using '-r {DATE}',
-using the markrelease date in the txtrepo as '{DATE}'.
-
-[0] We should add transaction support for markrelease. The transaction could
-be stored out of the packages SVN (another SVN, a DB, a txt file, etc.), and
-would work like:
-
-0. mark beginning of markrelease, early failing the package build if it fails
-1. do markrelease
-2. mark successful end of markrelease
- or mark failed markrelease, so we can replay it later
-
-
-Interesting use cases (first phase)
-===================================
-
-repsys co 2008.1/mutt
----------------------
-
-- repsys checks out
- http://svn.mandriva.com/svn/packages/updates/2008.1/mutt/current to the
- mutt directory
-
-- repsys checks out
- http://svn.mandriva.com/svn/binrepo/updates/2008.1/mutt/current/SOURCES
- into mutt/SOURCES-bin
-
-- creates symlinks for all files found in SOURCES-bin/ into ../SOURCES/
-
- (rpm doesn't handle symlinks, this allows us to have explicit links and
- a proper src.rpm generated by rpmbuild)
-
-In case the path doesn't exist in the binrepo it will not fail, as we may
-not have imported all packages or the repository is not prepared to work on
-this model, etc.
-
-markrelease of a package
-------------------------
-
-::
-
- $ repsys markrelease
-
-- will copy current/ to releases/VERSION/RELEASE, as usual
-
-- will copy current/ to releases/, on the binrepo too
-
-Optionally, markrelease could create revprops indicating which is the
-revision of current/ on the binrepo that represents the tarballs that are
-being tagged.
-
-
-Use cases to be implemented after the first phase
-=================================================
-
-upgrading to a newer version of the package
--------------------------------------------
-
-::
-
- $ cd bla/SOURCES/
- $ wget http://prdownloads.sourceforge.net/bla/bla-1.6.tar.bz2
- $ repsys add bla-1.6.tar.bz2
-
-- repsys notices this is a tarball (checking filename and/or file size)
-
-- repsys will move the file to SOURCES-bin/, create the symlink, and svn-add
- it to the working copy
-
- $ # the user updates the spec
-
- $ repsys rm SOURCES/bla-1.5.1.tar.bz2
-
-- it will remove the symlink and run svn rm on
- SOURCES-bin/bla-1.5.1.tar.bz2::
-
- $ cd ../ # package top dir
- $ repsys ci
-
-- repsys will commit the new tarball on SOURCES-bin/ and then on the rest
- of the working copy
-
-repsys sync would perform these steps too.
-
-importing a package
--------------------
-
- $ repsys putsrpm mypkg.src.rpm
-
-- repsys will open the src.rpm
-
-- will look for tarballs inside SOURCES/ and import them to
- http://svn.mandriva.com/svn/binrepo/cooker/mypkg/current/SOURCES/
-
-- will move the tarballs out of SOURCES and import the remaining files to
- http://svn.mandriva.com/svn/packages/cooker/mypkg/current/
-
-- will do whatever else putsrpm already does
-
-TODO
-=====
-
-First phase
------------
-
-- upload
-- markrelease
-- putsrpm
-- getsrpm
-
-
-Second phase
-------------
-
-- up
-- sync
-
-Rejected or postponed ideas
-===========================
-
-Use of a plain filesystem storage for the tarballs
---------------------------------------------------
-
-This was planned, then rejected. It becomes too complicated when thinking
-about markrelease, and mapping SVN revisions in the main repository to
-binary versions in the "tarballs storage", basically requiring
-implementing VCS-like features on top of a filesystem. Would also require
-implementing another authentication and access scheme. The main feature
-would be ease of removing old binaries, which isn't much of a point because
-we don't know precisely what and when we want to remove, so may end up not
-removing many files anyway.
-
-Use of a plain unversioned filesystem storage for the tarballs
---------------------------------------------------------------
-
-Unlike the previous one, this would mean not relying at all on
-binary files history keeping. Structure could be something simple like::
-
- packages/${pkg:0:1}/$pkg/$tarball
-
-This alternative does not suffice for Cooker, nor for supported distros, for
-which we want history. It could, however, at some point be used for "very
-old" distros, for which we may have lost interest in keeping *binaries*
-history (package history will be kept "forever" in the main SVN repository).
-Alternatively, "resetting" an SVN binrepo (i.e. recreate the repository) to
-contain only the latest tarballs would probably take about the same amount
-of space, anyway...
-
-Open tarballs repository
-------------------------
-
-This idea is not really rejected. It does not go against splitting txtrepo
-and binrepo, but rather complements this idea, where the
-open-tarballs-repository would take the place of the binrepo. The txtrepo
-would still be used more or less the same way. This repository could be used
-selectively, for packages where it makes sense, while most packages could be
-kept "closed", still as tarballs.
-
-Use of externals for more seamless Subversion usage
----------------------------------------------------
-
-This idea is not discarded, but it only provides convenience. OTOH, it makes
-things more complicated:
-
-- markrelease: externals would have to be updated in order to make them point
- to the tagged version in the binrepo, otherwise changes in
- current@binrepo would change older releases;
-- branching whole distro: even though subversion now supports "relative
- externals", we would have to update the URLs for *every* package on the
- distro, as the path to reach the binrepo spans the local distribution
- directory;
-- keeping externals up-to-date (as stated above and below)
-- authentication and access control: only markrelease action done by the
- build system should be allowed to change externals (so what about importing
- new packages?)
-- just a convenience, we don't need and shouldn't rely on externals for
- running the build system, while most people will use the repositories via
- Repsys, so why spend time to implement and keep it?
-- "svn co" works transparently, cool, but "svn co -r N" does not, otherwise
- every change in the binrepo would require svn:externals to be updated in
- the respective package;
-- it does not solve the problem of creating and handling symlinks between
- SOURCES and SOURCES-bin.
-
-Keeping svn:externals updated for every package has almost the same cost as
-keeping the `sha1.lst` updated, with the difference that in the latter we
-would not have to update every package when creating distro branches.
-
-Use of "external" xdelta to save space on binaries
---------------------------------------------------
-
-But how? First idea is this could be done by defining a protocol and
-assuming repository manipulation with repsys (for ease). Repsys could
-xdelta tarballs and add it to SVN with a special filename, then use it when
-checking out. Would require a policy/algorithm on when to ditch old whole
-binaries, too (i.e. hopefully wouldn't need to be handled manually by the
-maintainer). Also, this is something complementary to splitting the
-repository, so we may do it later, for binrepos.
-
-
-The Future
-==========
-
-- Open tarballs repositories
-
- - suited for GIT, maybe multi-VCS
- - incremental move
- - not everything will be suited for this, must handle all cases or be
- optional
-
-- Xdelta
-
-
-Deployment
-==========
-
-The current repository will be kept around for a while, in readonly state.
-Initial binrepos will be populated with the binaries in the `current/`
-branches of packages.
-
-The binrepo mappings config might be kept in a fixed subversion revision
-property (revision 0?).
-
-Rough steps
------------
-
-- check for agreement between subversion repository filters for binaries,
- and repsys
-- upgrade repsys everywhere
-
- - kenobi
- - cluster nodes
- - raoh
- - titan
-
-- populate the binrepos for each supported distro, from a specific revision
- of oldrepo, and mass commit the corresponding `sha1.lst` in txtrepo for
- every package
-
- - set svn:date revprop of the `sha1.lst` mass commit to the date of the
- oldrepo revision
- - before mass committing the `sha1.lst`, possibly freeze oldrepo, check
- for changes to sources after the selected revision, and update the
- binrepo as necessary
-
-- check Secteam scripts, make needed changes to get them ready (non
- critical)
-- set up the new repositories
-
- - hook for filtering of disallowed (binary) files in main repository
- - binrepos mappings
-
-- make the new main + binrepos repositories available, but readonly
-
- - keep new main repository in sync with the old repository with hooks
-
-- make current repository readonly and enable verification of sha1.lst at
- package submission time
-
-- make sure new main repository and old repository are in sync
-
- - resync binrepos with the old repository as needed
-
-- final tests
-
- - change something
- - submit
- - etc.
-
-- make the new repositories writeable
-