Difference between revisions of "Linux Tools"

From DUNE
 
(143 intermediate revisions by the same user not shown)
Line 1: Line 1:
=Intro=
+
=About this page=
This page is a collection of (hopefully) useful information and trivia
+
This page is a collection of various and often unrelated bits of information available elsewhere but kept
which may be required to build a Web service based on Django/Apache/PostgreSQL
+
here for quick reference and occasionally useful in building a functional system in the Linux environment.
and to manage a small pool of machines for testing purposes.
+
For ease of access, a lot of the information previously contained here has been factored out into separate
 +
articles accessible via the navigation sidebar on the left.
  
=Python=
+
=Remote Access and Execution=
At the time of writing the system version of Python is often 2.7, whereas newer applications
+
==Overview==
benefit from using Python 3.*. One way to deal with that is to include "env" in hashbang pointing
+
It is convenient to control a few machines from a single host. Typically ssh is used for this purpose,
to the exact version you want to use. Apache/WSGI deployments may require additional footwork
+
but if security is not a concern (e.g. when the network is strictly local) telnet can also be used as a quick solution.
to ensure the correct version of Python runtime is used in mod_wsgi etc.
+
It will also serve to "bootstrap" ssh connectivity, i.e. to debug ssh configuration remotely to make it operational.
  
'''Debian "Alternatives"''' - Debian has a way to specify the default version of an app. For example, if more than one version of Python
+
One advantage of ssh is X11 forwarding, which telnet does not support.
is present on the system, the command "update-alternatives" can be used to activate any of the available choices.
 
  
'''Caution''' - it's not a good idea to switch from the version of Python which came with your distro, since there are
+
==ssh==
documented and undocumented dependencies in various places, on that particular version. Random things may break
 
such as software update, applications like Dropbox etc. ''Caveat Emptor''.
 
  
Remove an alternative version:
+
<h4>Installation and keys</h4>
<pre>sudo update-alternatives --remove python /usr/bin/python3</pre>
+
You'll need to run the '''sshd''' service on every machine you want to connect to. On Linux, this is most frequently provided by '''openssh-server''', which is trivial to install. Make sure there is an ssh entry in /etc/services, with the desired port number.
  
Example above allows to fall back on the previous version, such as Python 2.7.
+
To be used productively, private and public keys will need to be generated or imported as necessary. For the private/public key pair to work, public keys should be added to the file ".ssh/authorized_keys". A matching private key must be loaded to an identity managing service (e.g. ssh-agent in case of Linux) on the machine ''from'' which you are going to connect. If it's not cached, you will likely be prompted to enter the passphrase for the key.
  
It is recommended that instead of replacing the default, relevant scripts contain explicit reference to version 3+ if possible.
+
Typically (this depends on the flavor of your sshd) you will get a message specifying which public key is used during the login that you are attempting. This is useful to know if you have many keys and forget which was used for what connection.
  
=Django=
+
Restarting the service:
 +
<pre>sudo systemctl restart ssh</pre>
  
There are a few ways to install Django, perhaps the cleanest and easiest is by using ''pip''. With Python 3+
+
Adding a key to the agent:
you will need to install pip3 first, like
 
 
<pre>
 
<pre>
apt-get install python3-pip
+
eval "$(ssh-agent -s)"
 +
ssh-add key_file
 
</pre>
 
</pre>
  
After that, Django is obtained by
+
You can also check which keys are loaded
 +
<pre>ssh-add -l</pre>
 +
 
 +
In case of problems while connecting, it may be helpful to check the log on the ssh server machine: /var/log/auth.log.
 +
 
 +
 
 +
Gateways such as the ones operating at BNL and other labs typically require that your public key be uploaded and cached on their side in advance. The exact procedure is site-dependent. Some sites require you to verify the upload by providing the public key's fingerprint. Example of how to get it:
 
<pre>
 
<pre>
pip3 install django==1.10
+
ssh-keygen -E md5 -lf my_public_key_file
 
</pre>
 
</pre>
...and other versions available instead of 1.10 can be specified if needed. An important and popular Django add-on package "tables2" can be added likewise:
+
 
 +
If you lost your public key (while still having your private one) you can re-create it:
 
<pre>
 
<pre>
pip3 install django-tables2
+
ssh-keygen -yf my_private_key_file
 
</pre>
 
</pre>
  
To check which version of Django you are using at the moment, start interactive Python and use this:
+
Once it's done, a connection becomes possible, for example:
 
<pre>
 
<pre>
import django
+
ssh username@atlasgw.usatlas.bnl.gov
django.VERSION
 
 
</pre>
 
</pre>
  
=Databases=
+
The '-X' option is needed to enable X11 forwarding in a connection established in this manner.
==Postgres==
 
===Installation===
 
<pre>
 
sudo apt-get update
 
sudo apt-get install postgresql postgresql-contrib
 
</pre>
 
  
You will likely need an additional package if using
+
<h4>Tunnels</h4>
PostgreSQL as the Django backend:
+
Using proxies at BNL:
<pre>
+
<pre>ssh -L 8080:130.199.23.54:3128 yourAccount@your.gateway.bnl.gov</pre>
sudo pip3 install psycopg2
 
</pre>
 
  
===Running===
+
The port 8080 is chosen as an example - it must be a number larger than a certain lower limit to satisfy a security policy. On your local machine, you would need to specify a proxy which looks like this:
If a restart of the DB engine is required:
 
 
<pre>
 
<pre>
sudo service postgresql restart
+
localhost:8080
 
</pre>
 
</pre>
  
===PSQL===
+
Another example when going from one Linux box to another:
====Log in/out on localhost====
 
The "psql" client requires a semicolon after each string you enter on
 
the command line. It won't report if it's missing and it's easy to forget.
 
 
 
Switch over to the postgres account on your server by typing:
 
 
<pre>
 
<pre>
sudo -i -u postgres
+
ssh -L 8000:localhost:8000 myRemoteHost
psql
 
 
</pre>
 
</pre>
  
Same without switching accounts (just switching for one session):
+
The above gives you access to the remote port 8000 on the local machine via localhost:8000. For example, this works for accessing a machine
 +
on the internal CERN network via HTTP:
 
<pre>
 
<pre>
sudo -u postgres psql
+
ssh -L 8008:neutdqm.cern.ch:8008 user@lxplus015.cern.ch
 
</pre>
 
</pre>
  
 +
If there is a need to access an HTTPS site, port number 443 needs to be forwarded. Forwarding to low-numbered ports (e.g. forwarding 443 remote to 443 local)
 +
will require sudo or root on most systems.
  
After having created a user and making sure authentication method
+
If there is a certificate issue it needs to be resolved either in the browser, or, if wget is used, by applying the --no-check-certificate option.
is set correctly in the configuration file (path may be system and version dependent
 
and named something like /etc/postgresql/9.5/main/pg_hba.conf), one can
 
connect to PostgreSQL not as the default "postgres" user but for example
 
as "p3s" or any other userID of choice:
 
<pre>
 
psql -U p3s -d tst
 
</pre>
 
  
The "-d" option is important because otherwise psql will assume a default
 
database name which may not in fact exist. In the above example, the "tst"
 
database was created beforehand by the user "postgres" to enable testing.
 
  
Example of getting help:
+
<h4>Password Automation</h4>
 +
There are a few cases when key-based auth is not suitable and one has to use passwords with ssh. To automate logging in one may choose to install and use the "sshpass" utility, provided the credentials you supply are not stored in the open. To force the password authentication method instead of the public key this option can be used:
 
<pre>
 
<pre>
testdb=# \h create table
+
-o PubkeyAuthentication=no
 
</pre>
 
</pre>
  
 +
<h4>Windows clients</h4>
 +
Once in a while you may need to use a Windows client to connect to various services via ssh. In Windows 10 the steps needed to get the ssh client(s) operational vary depending on the software release. The more recent updates (as of Spring 2019) have OpenSSH installed under Windows\System32\OpenSSH, with the usual complement of tools.
  
Exit out of the PostgreSQL prompt by typing: \q
+
==telnet==
 +
While using ssh is in general preferable for many reasons and foremost due to security concerns, sometimes there is a chicken and an egg problem where
 +
you need to establish access fast in order to debug ssh on a remote machine. In these cases, and if security
 +
is not a concern (rare, but could happen on an entirely internal network), one may opt to use telnet.
  
====Remote Access====
+
On Ubuntu one can install the software necessary to run the telnet service in the following manner:
Add or edit the following line in your postgresql.conf, in order to enable access
 
from any host (edit accordingly for more selective access rights):
 
 
<pre>
 
<pre>
listen_addresses = '*'
+
sudo apt-get install xinetd telnetd
 
</pre>
 
</pre>
  
To enable authentication from remote hosts for user "foo", edit pg_hba.conf to add
+
Make sure there is an entry in /etc/services which looks like
 
<pre>
 
<pre>
host    all            foo            0.0.0.0/0              md5
+
telnet        23/tcp
 
</pre>
 
</pre>
  
====Users====
+
Also, create a file /etc/xinetd.d/telnet with contents similar to this:
Create a user/role:
 
 
<pre>
 
<pre>
createuser --interactive
+
service telnet {   
 +
        disable        = no
 +
        flags          = REUSE
 +
        socket_type    = stream
 +
        wait            = no
 +
        user            = root
 +
        server          = /usr/sbin/in.telnetd
 +
        log_on_failure  += USERID HOST
 +
        log_on_success  += PID HOST EXIT
 +
        log_type        = FILE /var/log/xinetd.log
 +
}
 
</pre>
 
</pre>
  
Another example:
+
...and start the service as follows:
 
<pre>
 
<pre>
create user FOO with SUPERUSER
+
sudo /etc/init.d/xinetd start
 
</pre>
 
</pre>
  
===Databases and Tables===
+
==pdsh==
====Creation of DB====
+
This is an advanced parallel shell designed for cluster management. It often uses ssh as the underlying protocol although there are other options as well. Configuration is defined by files residing in /etc/pdsh. For example, the file "machines" needs to contain the list of computers to be targeted by pdsh. Optionally, this is also the place for a file that can be sourced for convenience of setup, cf
From the OS prompt:
 
 
<pre>
 
<pre>
sudo -u postgres createdb foo
+
# setup pdsh for cluster users
 +
export PDSH_RCMD_TYPE='ssh'
 +
export WCOLL='/etc/pdsh/machines'
 
</pre>
 
</pre>
  
Also can be done from within psql.
+
This of course can be done from the command line anyway, cf
 
<pre>
 
<pre>
postgres=# create database testdb;
+
export PDSH_RCMD_TYPE=ssh
 
</pre>
 
</pre>
  
====Tables====
+
Using ssh as the underlying protocol for pdsh implies that you have set up private and public keys just like you normally would for ordinary ssh login.
 +
Once this is done, you should be able to do something like this as a basic test of your setup:
 
<pre>
 
<pre>
testdb=# create table people (
+
pdsh -w targetHost "ls"
testdb(# name char(50) primary key not null,
 
testdb(# age int not null
 
testdb(# );
 
 
</pre>
 
</pre>
  
Changing a table:
+
If the targetHost is omitted, the command will be run against all machines listed in the "machines" file as explained above. Should a command fail on a particular machine, this will be indicated (with an error code) in the output of the command, with the name of the machine listed. Redirection of stderr with something like "2>/dev/null" included with the command you run won't work with pdsh.
 +
 
 +
Example of installation on CentOS:
 
<pre>
 
<pre>
ALTER TABLE foo ADD last_maint date;
+
yum install pdsh
 
</pre>
 
</pre>
  
<!-- In Ubuntu, if you need to add a few applications to your desktop, this can be done as follows:
+
==curl==
sudo cp /usr/share/applications/firefox.desktop  ~/Desktop/
 
sudo chmod +x ~/Desktop/firefox.desktop -->
 
  
====Info====
+
To post a form:
List of DBs:
 
 
<pre>
 
<pre>
\l
+
curl -X POST -F 'username=minime' -F 'password=something' http://blah.com
 +
curl -X POST -F 'username=minime'  -H "Content-Type: application/x-www-form-urlencoded" http://blah.com
 
</pre>
 
</pre>
  
List of schemas:
+
=Miscellanea=
 +
 
 +
==Linux Version and Distribution==
 +
 
 
<pre>
 
<pre>
\d
+
cat /etc/os-release
 +
lsb_release -a
 +
hostnamectl
 +
# Linux kernel version:
 +
uname -r
 
</pre>
 
</pre>
  
 +
This seems to work reliably:
 +
<pre>
 +
cat /proc/version
 +
</pre>
  
=Apache=
+
Also,
==Installation==
 
On Ubuntu:
 
 
<pre>
 
<pre>
sudo apt-get install apache2
+
cat /etc/*release
 +
# or
 +
cat /etc/issue*
 +
# or
 +
cat /proc/version
 
</pre>
 
</pre>
  
==Start-Stop-Restart==
+
==Linux User Management==
===Ubuntu===
+
 
To start/stop/restart Apache 2 web server, enter one of the commands in each category:
+
https://www.digitalocean.com/community/tutorials/how-to-create-a-sudo-user-on-centos-quickstart
 +
 
 
<pre>
 
<pre>
### START
+
adduser username
/etc/init.d/apache2 start
+
passwd username
sudo /etc/init.d/apache2 start
 
sudo service apache2 start
 
### STOP
 
/etc/init.d/apache2 stop
 
sudo /etc/init.d/apache2 stop
 
sudo service apache2 stop
 
### RESTART
 
/etc/init.d/apache2 restart
 
sudo /etc/init.d/apache2 restart
 
sudo service apache2 restart
 
 
</pre>
 
</pre>
  
System status:
+
Use the usermod command to add the user to the wheel group.
 +
 
 
<pre>
 
<pre>
systemctl status apache2.service
+
usermod -aG wheel username
 
</pre>
 
</pre>
  
 +
By default, on CentOS, members of the ''wheel'' group have sudo privileges.
  
===CentOS/RH===
+
==Network==
On RedHat Linux, the name of the daemon is httpd.
+
"nslookup" is a useful network information utility with diverse functionality. One simple function is to translate qualified host names to IP addresses and back.
Also, "service" command may be aliased to systemctl.
+
 
 +
"sha" headers one may need while installing xrootd can be obtained by running (on Ubuntu):
 
<pre>
 
<pre>
systemctl status -l httpd.service
+
sudo apt-get install libssl-dev
 +
</pre>
 +
...or as follows on CentOS
 +
<pre>
 +
sudo yum install openssl openssl-devel
 
</pre>
 
</pre>
  
==Apache Configuration==
+
libssl may be necessary also for installation of pip3 etc.
===General Items===
 
KeepAlive sets the tradeoff between memory and CPU usage by Apache.
 
  
Serving static files:
+
A few other dependencies of xrootd can be met by installing glib2.0.
https://docs.djangoproject.com/en/1.10/howto/deployment/wsgi/modwsgi/#serving-files
 
  
===Official Layout of the Config Files===
+
In case the network connection becomes stale, on Ubuntu:
https://wiki.apache.org/httpd/DistrosDefaultLayout
 
This, however, is not written in stone.
 
===Ubuntu===
 
 
<pre>
 
<pre>
/etc/apache2/apache2.conf
+
sudo service network-manager restart
 
</pre>
 
</pre>
  
==Deploying Django==
+
An extremely useful command (at least on Ubuntu) - lists IPs, DNSs etc:
===mod_wsgi===
+
<pre>
* When using mod_wsgi one has to make sure the version matches the Python version, this needs to be specified when mod_wsgi is installed
+
nmcli device show
* https://www.sitepoint.com/deploying-a-django-app-with-mod_wsgi-on-ubuntu-14-04/
+
</pre>
* Methods of setting up the environment for wsgi described in the current Django documentation may or may not work on a particular installation of Apache due to a few bugs and general complexity of *.conf and related files
 
  
====Ubuntu Example====
+
To see what process is listening on a given port:
Snippet from 000-default.conf on Ubuntu:
 
 
<pre>
 
<pre>
        ServerName promptproc
+
lsof -i :8000
        ServerAlias promptproc
+
</pre>
  
 +
==Shell==
  
        WSGIScriptAlias / /home/maxim/projects/p3s/promptproc/promptproc/wsgi.py
+
White space when using "sed":
 +
<pre>
 +
$ sed -e "s/\s\{3,\}/  /g" inputFile
 +
# substitutes every run of 3 or more whitespace characters with two spaces
 +
</pre>
  
        Alias /static/ /var/www/static/
+
Produce a convenient timestamp for various uses:
        <Directory /var/www/static>
+
<pre>
        Require all granted
+
date -d "today" +"%Y%m%d%H%M"
        </Directory>
+
</pre>
  
        <Directory /home/maxim/projects/p3s/promptproc/promptproc>
+
To get timestamps in history:
        <Files wsgi.py>
+
<pre>
        Require all granted
+
HISTTIMEFORMAT="%d/%m/%y %T "
        </Files>
+
</pre>
        </Directory>
 
  
 +
"find"
 +
<pre>
 +
find . -maxdepth 1 -mmin +400
 
</pre>
 
</pre>
The "static" directory must contain static content such as themes for the tables2 package.
 
Keep in mind that while this is served automatically by the Django development server,
 
it's not the case under Apache.
 
  
 +
'mmin' means it accepts minutes, 'mtime' days.
  
The file wsgi.conf needs to contain a reference to Python runtime like:
+
Find and recursively delete directories modified more than 5 hours ago:
 
<pre>
 
<pre>
WSGIPythonPath /home/maxim/.local/lib/python3.5/site-packages
+
find . -maxdepth 1 -mindepth 1 -mmin +300 -exec rm -fr {} \;
 
</pre>
 
</pre>
  
===Database Deployment===
+
If you don't specify 'mindepth', the current directory will show up in the results and will be deleted in the case presented above.
====Permissions====
+
 
Assuming you are using sqlite, the file permissions on the DB file do matter when you deploy under Apache.
+
Find files modified on a particular date:
So you either need to set wide permissions (may not be a good idea depending on the security situation) or
+
<pre>
change the owner to "www-data" (on Ubuntu) or "apache" (on CentOS). Other OS may require similar tweaks.
+
find . -type f -newermt 2018-04-11 ! -newermt 2018-04-12 -exec ls -l {} \;
 +
</pre>
  
====PostgreSQL====
+
Alternatively, this will find files between the two dates & times
An example of the "settings.py" clause:
 
 
<pre>
 
<pre>
DATABASES = {
+
touch -t 0810010000 /tmp/t1
    'default': {
+
touch -t 0810011000 /tmp/t2
        'ENGINE': 'django.db.backends.postgresql',
+
 
        'NAME': 'foo',
+
find / -newer /tmp/t1 -and -not -newer /tmp/t2
        'USER': 'bar',
 
        'PASSWORD': '***',
 
        'HOST': '',
 
        'PORT': '',
 
    }
 
}
 
 
</pre>
 
</pre>
  
=Misc Tools=
+
"cksum" - calculates CRC and byte count.
==ssh, telnet and other access methods==
+
 
It is convenient to control a few machines from a single host. Typically ssh is used for this purpose,
+
Remove line breaks from a file:
but if security is not a concern (e.g. when the network is strictly local) telnet can also be used as a quick solution.
+
<pre>echo $(cat $1)</pre>
It will also serve to "bootstrap" ssh connectivity, i.e. to debug ssh configuration remotely to make it operational.
+
 
 +
Redirect stdout to one file and stderr to another file: <pre>command > out 2>error</pre>
  
Among advantages of ssh is X11 forwarding, which functionality telnet does not have.
+
Redirect stderr to stdout (&1), and then redirect stdout to a file:<pre>command >out 2>&1</pre>
===nslookup===
 
This is a very useful network information utility with diverse functionality. One simple function is to translate qualified host names to IP addresses and back.
 
  
===ssh===
+
Redirect both to a file:<pre>command &> out</pre>
You'll need to run the '''sshd''' service on every machine you want to connect to. On Linux, this is most frequently '''openssh-server''' and it can be trivially installed. Make sure there is a ssh entry in /etc/services, with the desired port number.
 
  
To be used productively, private and public keys will need to be generated or imported as necessary. For the private/public key pair to work, public keys should be added to the file ".ssh/authorized_keys". A matching private key must be loaded to an identity managing service (e.g. ssh-agent in case of Linux) on the machine ''from'' which you are going to connect. If it's not cached, you will likely be prompted to enter the passphrase for the key.
 
  
Typically (this depends on the flavor of your sshd) you will get a message specifying which public key is used during the login that you are attempting. This is useful to know if you have many keys and forget which was used for what connection.
+
Find the name of the file, minus the complete path:
 +
<pre>
 +
f=$(basename /home/maxim/JOB.html)
 +
echo $f
 +
</pre>
  
Restarting the service:
+
==SUDO==
 +
To change the password prompt timeout for sudo, you will need to run the command ''sudo visudo'' (which is the way to safely edit the ''sudoers'' file) and modify the following line by adding the timeout clause set to the desired number of minutes:
 
<pre>
 
<pre>
sudo systemctl restart ssh
+
Defaults        env_reset, timestamp_timeout=XX
 
</pre>
 
</pre>
  
Adding a key to the agent:
+
==Crontab==
 +
*    minute (from 0 to 59)
 +
*    hour (from 0 to 23)
 +
*    day of month (from 1 to 31)
 +
*    month (from 1 to 12)
 +
*    day of week (from 0 to 6) (0=Sunday)
 +
 
 +
 
 
<pre>
 
<pre>
eval "$(ssh-agent -s)"
+
crontab -r # clear out your crontab
ssh-add key_file
+
crontab -l # list your crontab
 
</pre>
 
</pre>
  
Gateways such as one operating at BNL and other Labs typically require that your public key would be uploaded and cached on their side in advance. The exact way this can be done is site-dependent. Some sites require to verify the upload by providing the public key's fingerprint. Example of how to get it:
+
==Checksum==
 
<pre>
 
<pre>
ssh-keygen -E md5 -lf my_public_key_file
+
xrdadler32
 
</pre>
 
</pre>
 +
==CVMFS==
 +
 +
https://cernvm.cern.ch/portal/filesystem/downloads
  
If you lost your public key (while still having your private one) you can re-create it:
 
 
<pre>
 
<pre>
ssh-keygen -yf my_private_key_file
+
sudo apt-get install cvmfs cvmfs-config-default
 +
https://cernvm.cern.ch/portal/filesystem/quickstart
 
</pre>
 
</pre>
  
Once it's done, a connection becomes possible, for example:
+
==Encrypt a directory==
 
<pre>
 
<pre>
ssh username@atlasgw.usatlas.bnl.gov
+
tar cz myDir/ | mcrypt -k myPassword > myDir.z.nc
 
</pre>
 
</pre>
  
The '-X' option is needed to enable X11 forwarding in a connection established in this manner.
+
=Version Control (git)=
 +
[[ Git ]]
  
Tunneling at BNL:
+
==Starting out==
 +
Notify git of your identity and ID:
 
<pre>
 
<pre>
ssh -L 8080:130.199.23.54:3128 yourAccount@your.gateway.bnl.gov
+
git config --global user.email "yourname@yoursite.yourdomain"
 +
git config --global user.name yourID
 
</pre>
 
</pre>
  
The port 8080 is chosen as an example - by rules it must be a number larger than a certain low limit. On your local machine, you would need to specify a proxy which looks like this:
+
Pick a better editor for commit messages:
 
<pre>
 
<pre>
localhost:8080
+
git config --global core.editor "nano"
 
</pre>
 
</pre>
  
Another example when going from one Linux box to another:
+
To avoid entering git userID and password:
 +
<pre>git config --global credential.helper 'cache --timeout 7200'</pre>
 +
 
 +
To address the usual "^M" problem when switching between Linux and Windows environments
 
<pre>
 
<pre>
ssh -L 8000:localhost:8000 myRemoteHost
+
$ git config --global core.autocrlf true
 +
# Remove everything from the index
 +
$ git rm --cached -r .
 +
 
 +
# Re-add all the deleted files to the index
 +
# You should get lots of messages like: "warning: CRLF will be replaced by LF in <file>."
 +
$ git diff --cached --name-only -z | xargs -0 git add
 +
 
 +
# Commit
 +
$ git commit -m "Fix CRLF"
 
</pre>
 
</pre>
  
The above gives you access to the remote port 8000 on the local machine via localhost:8000.
+
(Also see https://stackoverflow.com/questions/1889559/git-diff-to-ignore-m)
  
===telnet===
+
==Restoring Files==
While using ssh is in general preferable for many reasons and foremost due to security concerns, sometimes there is a chicken and an egg problem where
+
First, see this link:
you need to establish access fast in order to debug ssh on a remote machine. In these cases, and if security
+
 
is not a concern (rare, but could happen on an entirely internal network), one may opt to use telnet.
+
https://stackoverflow.com/questions/953481/find-and-restore-a-deleted-file-in-a-git-repository
  
On Ubuntu one can install the software necessary to run the telnet service in the following manner:
+
A recipe that may work well:
 
<pre>
 
<pre>
sudo apt-get install xinetd telnetd
+
git log --diff-filter=D --summary # finds deleted files
 +
git checkout $commit~1 filename # where "$commit" stands for the actual commit name (a long string)
 
</pre>
 
</pre>
 +
In the above, it's best to operate from the top level directory of the project and use path relative to that.
 +
Also, you may want to "git add" the restored files and commit them to make it permanent.
  
Make sure there is an entry in /etc/services which looks like
+
If you want to get a specific previous revision of a file, just capture the stdout of the following command:
 
<pre>
 
<pre>
telnet        23/tcp
+
git show $REV:$FILE
 
</pre>
 
</pre>
 +
...and rename the output as you see fit.
  
Also, create a file /etc/xinetd.d/telnet with contents similar to this:
+
==Undoing a commit==
 +
See:
 +
 
 +
https://sethrobertson.github.io/GitFixUm/fixup.html
 +
 
 +
To discard uncommitted changes and reset the working tree to HEAD (this does not remove a commit):
 
<pre>
 
<pre>
service telnet {   
+
git reset --hard HEAD
        disable        = no
 
        flags          = REUSE
 
        socket_type    = stream
 
        wait            = no
 
        user            = root
 
        server          = /usr/sbin/in.telnetd
 
        log_on_failure  += USERID HOST
 
        log_on_success  += PID HOST EXIT
 
        log_type        = FILE /var/log/xinetd.log
 
}
 
 
</pre>
 
</pre>
  
...and start the service as follows:
+
To remove the last two commits, or only the last one:
 
<pre>
 
<pre>
sudo /etc/init.d/xinetd start
+
git reset --hard HEAD~2
 +
git reset --hard HEAD~1
 
</pre>
 
</pre>
  
===pdsh===
+
==GitHub quirks==
This is an advanced parallel shell designed for cluster management. It often uses ssh as the underlying protocol although there are other options as well. Configuration is defined by files residing in /etc/pdsh. For example, the file "machines" needs to contain the list of computers to be targeted by pdsh. Optionally, this is also the place for a file that can be sourced for convenience of setup, cf
+
Sometimes a cloned repo will end up in a state where you can't push local content. One thing you might want to try:
 
<pre>
 
<pre>
# setup pdsh for cluster users
+
git remote set-url origin https://myNameOnGithub@github.com/DUNE/dqmconfig.git
export PDSH_RCMD_TYPE='ssh'
 
export WCOLL='/etc/pdsh/machines'
 
 
</pre>
 
</pre>
  
This of course can be done from the command line anyway, cf
+
And in case it was not annoying enough, if you see something like "can't open display"
 +
this may help:
 
<pre>
 
<pre>
export PDSH_RCMD_TYPE=ssh
+
unset SSH_ASKPASS
 
</pre>
 
</pre>
  
Using ssh as the underlying protocol for pdsh implies that you have set up private and public keys just like you normally would for ordinary ssh login.
+
==Empty Commit==
Once this is done, you should be able to do something like this as a basic test of your setup:
+
When you need to trigger an action on GitHub, or in a similar situation, the following
 +
"empty commit" can be used (and then pushed):
 
<pre>
 
<pre>
pdsh -w targetHost "ls"
+
git commit -m 'rebuild pages' --allow-empty
 
</pre>
 
</pre>
  
If the targetHost is omitted, the command will be run against all machines listed in the "machines" file as explained above. Should a command fail on a particular machine, this will be indicated (with an error code) in the output of the command, with the name of the machine listed. Redirection of stderr with something like "2>/dev/null" included with the command you run won't work with pdsh.
+
=LaTeX=
 
 
==Version Control==
 
Notify git of your identity:
 
<pre>git config --global user.email "yourname@yoursite.yourdomain"</pre>
 
 
 
To avoid entering git userID and password:
 
<pre>git config --global credential.helper 'cache --timeout 7200'</pre>
 
 
 
==LaTeX==
 
 
One can choose to install all of tex packages or just a few:
 
One can choose to install all of tex packages or just a few:
 
<pre>
 
<pre>
Line 420: Line 435:
 
config files still around ("dpkg --purge" or "apt-get remove --purge"
 
config files still around ("dpkg --purge" or "apt-get remove --purge"
 
gets rid of the "rc" but they are just harmless cruft).
 
gets rid of the "rc" but they are just harmless cruft).
 +
 +
=Setting the environment for HTCondor=
 +
 +
It is often desirable to dynamically modify the content of the condor submit file (typically
 +
having the JDL extension). While it does not appear possible to access the shell environment
 +
variables within the submit file directly, a similar effect can be obtained by setting
 +
the internal HTCondor parameters on the command line, cf:
 +
<pre>
 +
condor_submit A=100 foo.jdl
 +
</pre>
 +
Then, one can access the value of "A" within the JDL file as $(A).
 +
 +
To find the number of idle jobs:
 +
<pre>
 +
/usr/bin/condor_q 2>&1| tail -1 | cut -d' ' -f 7
 +
</pre>

Latest revision as of 23:31, 30 November 2020

About this page

This page collects various, often unrelated bits of information that are available elsewhere but kept here for quick reference; they are occasionally useful when building a functional system in a Linux environment. For ease of access, much of the information previously contained here has been moved into separate articles accessible via the navigation sidebar on the left.

Remote Access and Execution

Overview

It is convenient to control a few machines from a single host. Typically ssh is used for this purpose, but if security is not a concern (e.g. when the network is strictly local) telnet can also be used as a quick solution. It will also serve to "bootstrap" ssh connectivity, i.e. to debug ssh configuration remotely to make it operational.

One advantage of ssh is X11 forwarding, which telnet does not support.

ssh

Installation and keys

You'll need to run the sshd service on every machine you want to connect to. On Linux, this is most frequently provided by openssh-server, which is trivial to install. Make sure there is an ssh entry in /etc/services, with the desired port number.

To be used productively, ssh needs private and public keys, generated or imported as necessary. For a private/public key pair to work, the public key should be added to the file ".ssh/authorized_keys" on the machine you connect to. The matching private key must be loaded into an identity-managing service (e.g. ssh-agent on Linux) on the machine from which you are going to connect. If the key is not cached, you will likely be prompted to enter its passphrase.

Typically (this depends on the flavor of your sshd) you will get a message specifying which public key is used during the login that you are attempting. This is useful to know if you have many keys and forget which was used for what connection.

Restarting the service:

sudo systemctl restart ssh

Adding a key to the agent:

eval "$(ssh-agent -s)"
ssh-add key_file

You can also check which keys are loaded:

ssh-add -l
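The key setup described above can be sketched as follows. This is only an illustration run in a scratch directory, so nothing under the real ~/.ssh is touched; the names "id_demo" and "server_home" are made up for the example, and the empty passphrase is acceptable for a demo only.

```shell
#!/usr/bin/env bash
# Scratch directory standing in for both the client and the server side.
demo=$(mktemp -d)

# Client side: generate an Ed25519 key pair.
# -N '' sets an empty passphrase (demo only), -C sets the key comment.
ssh-keygen -q -t ed25519 -N '' -C 'demo key' -f "$demo/id_demo"

# Server side: the public key is appended to ~/.ssh/authorized_keys.
# sshd normally expects restrictive permissions on both directory and file.
mkdir -p "$demo/server_home/.ssh"
chmod 700 "$demo/server_home/.ssh"
cat "$demo/id_demo.pub" >> "$demo/server_home/.ssh/authorized_keys"
chmod 600 "$demo/server_home/.ssh/authorized_keys"
```

On a real remote host the last four commands are usually replaced by a single call to ssh-copy-id, e.g. "ssh-copy-id -i id_demo.pub user@host".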

In case of problems while connecting, it may be helpful to check the log on the ssh server machine: /var/log/auth.log.


Gateways such as the ones operating at BNL and other labs typically require that your public key be uploaded and cached on their side in advance. The exact procedure is site-dependent. Some sites require you to verify the upload by providing the public key's fingerprint. Example of how to get it:

ssh-keygen -E md5 -lf my_public_key_file

If you lost your public key (while still having your private one) you can re-create it:

ssh-keygen -yf my_private_key_file

Once the key is in place on the gateway, a connection becomes possible, for example:

ssh username@atlasgw.usatlas.bnl.gov

The '-X' option is needed to enable X11 forwarding in a connection established in this manner.

Tunnels

Using proxies at BNL:

ssh -L 8080:130.199.23.54:3128 yourAccount@your.gateway.bnl.gov

The port 8080 is chosen as an example. On most systems it must be 1024 or higher unless you are root, since lower-numbered ports are privileged. On your local machine, you would then specify a proxy which looks like this:

localhost:8080

Another example when going from one Linux box to another:

ssh -L 8000:localhost:8000 myRemoteHost

The above gives you access to port 8000 of the remote host from the local machine via localhost:8000. A similar command works for accessing a machine on the internal CERN network via http, going through a gateway host:

ssh -L 8008:neutdqm.cern.ch:8008 user@lxplus015.cern.ch

If there is a need to access an HTTPS site, port number 443 needs to be forwarded. Binding a low-numbered local port (e.g. forwarding remote 443 to local 443) will require sudo or root on most systems.
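The pattern behind the tunnel examples above is always local port, then target host and port as seen from the gateway, then the gateway itself. A small sketch that only prints such a command (it does not open a connection) may help to keep the order straight; the function name and all host names are placeholders, not real services.

```shell
# Print (rather than run) an ssh local-forwarding command.
# Arguments: local port, target host, target port, gateway (all placeholders).
tunnel_cmd() {
    printf 'ssh -N -L %s:%s:%s %s\n' "$1" "$2" "$3" "$4"
}

# Port 8443 is an unprivileged stand-in for 443, so no root is needed locally
cmd=$(tunnel_cmd 8443 internal.example.org 443 user@gateway.example.org)
echo "$cmd"
```

The -N option tells ssh not to run a remote command, which is convenient when the session exists only to carry the tunnel. With the command above running, the site would be reached as https://localhost:8443.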

If there is a certificate issue it needs to be resolved either in the browser, or, if wget is used, by applying the --no-check-certificate option.


Password Automation

There are a few cases when key-based auth is not suitable and one has to use passwords with ssh. To automate logging in one may choose to install and use the "sshpass" utility, provided the credentials you supply are not stored in the open. To force password authentication instead of public key authentication, this ssh option can be used:

-o PubkeyAuthentication=no
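A minimal sketch of how the pieces might fit together, assuming sshpass is installed and the password is kept in a file readable only by its owner. The helper name remote_run, the password file location and the host name are illustrative assumptions; the function is only defined here, not called.

```shell
# Hypothetical helper: run one command on a remote host using password auth.
# Assumes the file named by $pwfile holds the password on its first line
# and has been protected with chmod 600.
pwfile="$HOME/.ssh/demo_host.pw"      # illustrative location

remote_run() {
    # $1 = user@host, remaining arguments = command to run remotely
    target=$1; shift
    sshpass -f "$pwfile" ssh -o PubkeyAuthentication=no "$target" "$@"
}

# Example call (not executed here):
#   remote_run user@host.example.org uptime
```

Reading the password from a protected file with -f keeps it out of the command line and therefore out of the process list and the shell history.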

Windows clients

Once in a while you may need to use a Windows client to connect to various services via ssh. In Windows 10 the steps needed to get the ssh client(s) operational vary depending on the software release. The more recent updates (as of Spring 2019) have OpenSSH installed under Windows\System32\OpenSSH, with the usual complement of tools.

telnet

While using ssh is in general preferable for many reasons, foremost among them security, sometimes there is a chicken-and-egg problem where you need to establish access fast in order to debug ssh on a remote machine. In these cases, and if security is not a concern (rare, but it could happen on an entirely internal network), one may opt to use telnet.

On Ubuntu one can install the software necessary to run the telnet service in the following manner:

sudo apt-get install xinetd telnetd

Make sure there is an entry in /etc/services which looks like

telnet        23/tcp

Also, create a file /etc/xinetd.d/telnet with contents similar to this:

service telnet {    
        disable         = no
        flags           = REUSE
        socket_type     = stream
        wait            = no
        user            = root
        server          = /usr/sbin/in.telnetd
        log_on_failure  += USERID HOST
        log_on_success  += PID HOST EXIT
        log_type        = FILE /var/log/xinetd.log
}

...and start the service as follows:

sudo /etc/init.d/xinetd start

pdsh

pdsh is an advanced parallel shell designed for cluster management. It often uses ssh as the underlying protocol although there are other options as well. Configuration is defined by files residing in /etc/pdsh. For example, the file "machines" needs to contain the list of computers to be targeted by pdsh. Optionally, this is also the place for a file that can be sourced for convenience of setup, e.g.

# setup pdsh for cluster users
export PDSH_RCMD_TYPE='ssh'
export WCOLL='/etc/pdsh/machines'

The same can of course be done from the command line, e.g.

export PDSH_RCMD_TYPE=ssh

Using ssh as the underlying protocol for pdsh implies that you have set up private and public keys just like you normally would for ordinary ssh login. Once this is done, you should be able to do something like this as a basic test of your setup:

pdsh -w targetHost "ls"

If the targetHost is omitted, the command will be run against all machines listed in the "machines" file as explained above. Should a command fail on a particular machine, this will be indicated (with an error code) in the output of the command, with the name of the machine listed. Redirection of stderr with something like "2>/dev/null" included with the command you run won't work with pdsh.

Example of installation on CentOS:

yum install pdsh

curl

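To post a form as multipart/form-data (this is what the -F option does; curl switches to POST and sets the Content-Type header on its own):

curl -F 'username=minime' -F 'password=something' http://blah.com

To post the same fields as application/x-www-form-urlencoded, use -d instead of -F rather than overriding the header:

curl -d 'username=minime' -d 'password=something' http://blah.com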

Miscellanea

Linux Version and Distribution

cat /etc/os-release
lsb_release -a
hostnamectl
# Linux kernel version:
uname -r

This seems to work reliably:

cat /proc/version

Also,

cat /etc/*release
# or
cat /etc/issue*
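When the distribution name is needed inside a script rather than on the screen, one possible approach is to source /etc/os-release, which consists of shell-style VAR=value assignments. The sketch below assumes the file exists (it does on most current distributions) and falls back to "unknown" otherwise; the variable names distro and kernel are chosen for this example.

```shell
# Source the file if present; NAME and VERSION_ID are standard os-release keys,
# although VERSION_ID may be missing on rolling-release distributions
if [ -r /etc/os-release ]; then
    . /etc/os-release
    distro="${NAME:-unknown} ${VERSION_ID:-}"
else
    distro="unknown"
fi

kernel=$(uname -r)
echo "distribution: $distro, kernel: $kernel"
```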

Linux User Management

https://www.digitalocean.com/community/tutorials/how-to-create-a-sudo-user-on-centos-quickstart

adduser username
passwd username

Use the usermod command to add the user to the wheel group.

usermod -aG wheel username

By default, on CentOS, members of the wheel group have sudo privileges.

Network

"nslookup" is a useful network information utility with diverse functionality. One simple function is to translate qualified host names to IP addresses and back.

"sha" headers one may need while installing xrootd can be obtained by running (on Ubuntu):

sudo apt-get install libssl-dev

...or as follows on CentOS

sudo yum install openssl openssl-devel

libssl may be necessary also for installation of pip3 etc.

A few other dependencies of xrootd can be met by installing glib2.0.

In case the network connection becomes stale, on Ubuntu:

sudo service network-manager restart

An extremely useful command (at least on Ubuntu) - lists IPs, DNSs etc:

nmcli device show

To see what process is listening on a given port:

lsof -i :8000

Shell

White space when using "sed":

sed -e "s/\s\{3,\}/  /g" inputFile

This substitutes every run of three or more whitespace characters with two spaces (\s is a GNU sed extension).
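A quick way to try the substitution without an input file is to feed sed a test string. The sample text below is made up for the demonstration; the second command uses the POSIX character class [[:space:]] as a more portable spelling of the same expression.

```shell
# GNU sed: \s matches any whitespace character
squeezed=$(printf 'a     b c\n' | sed -e "s/\s\{3,\}/  /g")
echo "$squeezed"    # the run of five spaces shrinks to two, the single space stays

# Portable variant using a POSIX bracket expression
portable=$(printf 'a     b c\n' | sed -e 's/[[:space:]]\{3,\}/  /g')
echo "$portable"
```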

Produce a convenient timestamp for various uses:

date -d "today" +"%Y%m%d%H%M"
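A typical use of such a timestamp is to build unique file names. In the sketch below the prefix "results_" and the variable names are arbitrary; the -d "today" part is left out since date uses the current time by default.

```shell
stamp=$(date +"%Y%m%d%H%M")           # 12 digits: year, month, day, hour, minute
archive="results_${stamp}.tar.gz"     # hypothetical file name
echo "$archive"
```

Because the fields run from the largest unit to the smallest, names built this way sort chronologically in a plain directory listing.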

To get timestamps in the bash history (e.g. by adding this line to ~/.bashrc):

HISTTIMEFORMAT="%d/%m/%y %T "

"find"

find . -maxdepth 1 -mmin +400

'-mmin' takes its argument in minutes, '-mtime' in days; the example above lists entries last modified more than 400 minutes ago.

Find and recursively delete directories modified more than 5 hours ago:

find . -maxdepth 1 -mindepth 1 -mmin +300 -exec rm -fr {} \;

If you don't specify '-mindepth 1', the starting directory itself will show up in the results and, in the case presented above, find will attempt to delete it as well.
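Since the command above is destructive, it may be worth running the same test with -print first to see what would be removed. The sketch below does this in a scratch directory; the directory names are invented, the -type d test is added here to restrict the match to directories, and touch -d with a relative date assumes GNU coreutils.

```shell
# Scratch area with one stale and one fresh directory (hypothetical layout)
workdir=$(mktemp -d)
mkdir "$workdir/old_run" "$workdir/new_run"
touch -d "6 hours ago" "$workdir/old_run"   # backdate the modification time

# Same selection as above, but listing instead of deleting
stale=$(cd "$workdir" && find . -maxdepth 1 -mindepth 1 -type d -mmin +300 -print)
echo "$stale"

rm -rf "$workdir"                            # clean up the scratch area
```

Once the listing looks right, -print can be swapped for the -exec rm -fr {} \; clause.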

Find files modified on a particular date:

find . -type f -newermt 2018-04-11 ! -newermt 2018-04-12 -exec ls -l {} \;

Alternatively, this will find files modified between two dates and times (here 2008-10-01 00:00 and 2008-10-01 10:00), using two reference files:

touch -t 0810010000 /tmp/t1
touch -t 0810011000 /tmp/t2

find / -newer /tmp/t1 -and -not -newer /tmp/t2

"cksum" - calculates CRC and byte count.

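Remove line breaks from a file (here the file name is the first argument of a script; runs of whitespace are also collapsed to single spaces):

echo $(cat "$1")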

Redirect stdout to one file and stderr to another file:

command > out 2>error

Redirect stdout to a file and then point stderr at the same destination (&1); the order of the two redirections matters:

command >out 2>&1

Redirect both to a file:

command &> out
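The effect of these redirections can be checked with a small stand-in command that writes one line to each stream. The function name noisy and the file names are made up for this sketch; the portable "> file 2>&1" form is used since "&>" is specific to bash.

```shell
# Hypothetical command that writes one line to stdout and one to stderr
noisy() { echo "to stdout"; echo "to stderr" >&2; }

tmp=$(mktemp -d)
noisy > "$tmp/out" 2> "$tmp/err"      # each stream in its own file
noisy > "$tmp/both" 2>&1              # both streams in one file

out_lines=$(wc -l < "$tmp/out")
err_lines=$(wc -l < "$tmp/err")
both_lines=$(wc -l < "$tmp/both")
echo "out: $out_lines, err: $err_lines, both: $both_lines"

rm -rf "$tmp"
```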


Find the name of the file, minus the complete path:

f=$(basename /home/maxim/JOB.html)
echo $f
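The same result can be obtained without starting an external program by using shell parameter expansion, which also makes it easy to drop the extension. A short sketch using the path from the example above; the variable names are arbitrary.

```shell
path=/home/maxim/JOB.html

f=${path##*/}            # strip the longest prefix ending in "/"    -> JOB.html
stem=${f%.*}             # strip the shortest suffix starting at "." -> JOB
dir=$(dirname "$path")   # the directory part                        -> /home/maxim

echo "$dir $f $stem"
```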

SUDO

To change the password prompt timeout for sudo, you will need to run the command sudo visudo (which is the way to safely edit the sudoers file) and modify the following line by adding the timeout clause set to the desired number of minutes:

Defaults        env_reset, timestamp_timeout=XX

Crontab

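A crontab entry consists of five time fields, in the order listed below, followed by the command to run:

  • minute (from 0 to 59)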
  • hour (from 0 to 23)
  • day of month (from 1 to 31)
  • month (from 1 to 12)
  • day of week (from 0 to 6) (0=Sunday)
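To make the field order concrete, the sketch below takes one sample entry and splits it into its parts. The schedule and the script path /home/user/bin/cleanup.sh are invented for illustration, and nothing is installed into the actual crontab.

```shell
# Hypothetical entry: run the script at 03:30 every Monday
entry='30 3 * * 1 /home/user/bin/cleanup.sh'

# Split into the five time fields plus the command
read -r minute hour dom month dow cmd <<EOF
$entry
EOF

echo "minute=$minute hour=$hour day-of-month=$dom month=$month day-of-week=$dow"
echo "command=$cmd"

# One common idiom to append the entry to an existing crontab (not run here):
#   (crontab -l; echo "$entry") | crontab -
```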


crontab -r # clear out your crontab
crontab -l # list your crontab

Checksum

xrdadler32

CVMFS

https://cernvm.cern.ch/portal/filesystem/downloads

sudo apt-get install cvmfs cvmfs-config-default
https://cernvm.cern.ch/portal/filesystem/quickstart

Encrypt a directory

tar cz myDir/ | mcrypt -k myPassword > myDir.z.nc

Version Control (git)

Git

Starting out

Notify git of your identity and ID:

git config --global user.email "yourname@yoursite.yourdomain"
git config --global user.name yourID

Pick a better editor for commit messages:

git config --global core.editor "nano"

To avoid re-entering your git user ID and password for every operation (credentials are cached in memory for the given number of seconds):

git config --global credential.helper 'cache --timeout 7200'

To address the usual "^M" problem when switching between Linux and Windows environments

$ git config --global core.autocrlf true
# Remove everything from the index
$ git rm --cached -r .

# Re-add all the deleted files to the index
# You should get lots of messages like: "warning: CRLF will be replaced by LF in <file>."
$ git diff --cached --name-only -z | xargs -0 git add

# Commit
$ git commit -m "Fix CRLF"

(Also see https://stackoverflow.com/questions/1889559/git-diff-to-ignore-m)

Restoring Files

First, see this link:

https://stackoverflow.com/questions/953481/find-and-restore-a-deleted-file-in-a-git-repository

A recipe that may work well:

git log --diff-filter=D --summary # finds deleted files
git checkout $commit~1 filename # where "$commit" stands for the actual commit name (a long string)

In the above, it's best to operate from the top level directory of the project and use path relative to that. Also, you may want to "git add" the restored files and commit them to make it permanent.

If you want to get a specific previous revision of a file, just capture the stdout of the following command:

git show $REV:$FILE

...and rename the output as you see fit.
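The two recipes above can be tried safely in a throwaway repository. The sketch below creates one in a temporary directory, deletes a file, then brings it back; the file name notes.txt, the identity settings and the use of git rev-list to locate the deleting commit are choices made for this example, not the only way to do it.

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.email "demo@example.org"    # repository-local demo identity
git config user.name "demo"

echo "important notes" > notes.txt
git add notes.txt
git commit -q -m "add notes"
git rm -q notes.txt
git commit -q -m "remove notes"

# The most recent commit touching the path is the one that deleted it
commit=$(git rev-list -n 1 HEAD -- notes.txt)

# Restore the file from the parent of that commit
git checkout -q "$commit~1" -- notes.txt
restored=$(cat notes.txt)

# Alternatively, capture the old revision under a different name
git show "$commit~1:notes.txt" > notes_old.txt
echo "$restored"

cd - >/dev/null || exit 1
```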

Undoing a commit

See:

https://sethrobertson.github.io/GitFixUm/fixup.html

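To discard all uncommitted changes and return the working tree to the state of the latest commit (HEAD):

git reset --hard HEAD

To remove the last two commits, or only the last one (their changes are discarded from the working tree as well):

git reset --hard HEAD~2
git reset --hard HEAD~1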

GitHub quirks

Sometimes a cloned repo will end up in a state where you can't push local content. One thing you might want to try:

git remote set-url origin https://myNameOnGithub@github.com/DUNE/dqmconfig.git

And in case it was not annoying enough, if you see something like "can't open display" this may help:

unset SSH_ASKPASS

Empty Commit

When you need to trigger an action on GitHub, or in a similar situation, the following "empty commit" can be used (and then pushed):

git commit -m 'rebuild pages' --allow-empty

LaTeX

One can choose to install all of the TeX packages or just a few:

apt install texlive texlive-humanities texlive-science

To see what is installed:

dpkg -l

The two-letter code at the front of each line gives the status of the package: "ii" means installed and "rc" means removed but with config files still around ("dpkg --purge" or "apt-get remove --purge" gets rid of the "rc" entries, but they are just harmless cruft).

Setting the environment for HTCondor

It is often desirable to dynamically modify the content of the condor submit file (typically having the JDL extension). While it does not appear possible to access the shell environment variables within the submit file directly, a similar effect can be obtained by setting the internal HTCondor parameters on the command line, cf:

condor_submit A=100 foo.jdl

Then, one can access the value of "A" within the JDL file as $(A).
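For illustration, a minimal submit file using such a parameter could be produced as follows. The job itself (running /bin/echo) and the output file names are assumptions made for this sketch; the file is only written to a temporary location, not submitted.

```shell
# Write a hypothetical submit file; the quoted 'EOF' keeps the shell from
# treating $(A) as a command substitution, so it reaches HTCondor untouched
jdl=$(mktemp)
cat > "$jdl" <<'EOF'
executable = /bin/echo
arguments  = $(A)
output     = echo.$(A).out
log        = echo.log
queue
EOF

# Count the lines that refer to the parameter
uses=$(grep -c '\$(A)' "$jdl")
echo "lines using \$(A): $uses"

# Submission would then look like this (not run here):
#   condor_submit A=100 "$jdl"
```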

To find the number of idle jobs:

/usr/bin/condor_q 2>&1| tail -1 | cut -d' ' -f 7