Commit graph

34 commits

Author SHA1 Message Date
Jesse Becker
2d92a0369f Simplify gravity_advanced.
Remove extraneous calls to several programs (cat, uniq).
2015-09-12 00:26:29 -04:00
Jesse Becker
e1b9ed450c Remove duplicate -s in curl command 2015-09-12 00:08:43 -04:00
Jesse Becker
a5f2305947 Check if files are readable, not just present 2015-09-12 00:05:19 -04:00
Jesse Becker
1e7f843993 Simplify (and speed up slightly) awk/sed domain name extraction 2015-09-12 00:03:56 -04:00
Jesse Becker
47abe65090 Store block lists in temp file, instead of RAM.
Storing the output from 'curl' commands directly as shell variables is
very inefficient, and requires much more RAM for gravity.sh any time there is
an update to the block lists (and especially on the first run).  Store
the raw blocklists in a temporary file on disk, and process those.
2015-09-11 23:26:25 -04:00
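
A minimal sketch of the temp-file approach described in this commit, not the actual gravity.sh code. The `fetch_list` function is a hypothetical offline stand-in for a `curl -s "$url"` call, and the sample entries and file names are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for `curl -s "$url"`, so the sketch runs without a network.
fetch_list() {
    printf '127.0.0.1 tracker.example.net\n127.0.0.1 ads.example.com\n127.0.0.1 ads.example.com\n'
}

# Keep the raw list on disk instead of in a shell variable.
raw_list=$(mktemp)
domains_file=$(mktemp)
trap 'rm -f "$raw_list" "$domains_file"' EXIT

fetch_list > "$raw_list"

# Stream the file through awk and sort; the shell never holds the list itself.
awk '{print $2}' "$raw_list" | sort -u > "$domains_file"
```

The design point is that the shell only passes file names around, so the memory cost stays with the streaming tools rather than with the script's variables.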
Jacob Salmela
e19a6c3624 Merge pull request #38 from korhadris/master
Fixes #32 and fixes #35
2015-09-06 10:11:39 -05:00
jacobsalmela
fa77b7b69d increase swap to fix #37 memory error
This will increase the swap file to 500MB before downloading the lists.
 Most of the issue comes from the mahakala list, which is so large.  If
no swap file is found, one is created.
2015-08-25 18:01:54 -05:00
korhadris
98c94912e1 Replace use of grep -w with grep -x.
Prepend "^" to start of latentWhitelist.txt lines.

The -x switch requires a full line match of the regexp, whereas -w
will try to find the match somewhere in the line, looking for word
breaks. Combined with turning the whitelist lines into full regexps,
this results in significantly faster parsing.

Having "^" prepended to the lines also keeps false whitelisting from
occurring, such as the following example:

If whitelist.txt contains "google.com" it would whitelist many other
sites that end in "google.com" as long as there is a non-word
character preceding the google (such as "-", or ".").
2015-08-22 23:37:01 -07:00
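
The difference between the two switches can be seen in a small sketch. The candidate domains below are illustrative only, and the escaped dot in the anchored pattern is an assumption about how a whitelist line might be turned into a full regexp.

```shell
#!/usr/bin/env bash
# Illustrative candidate domains, not taken from any real list.
candidates=$(printf '%s\n' google.com ads.google.com evil-google.com)

# -w: the pattern may match anywhere in the line, as long as it is
# bounded by non-word characters such as "." or "-".
loose=$(printf '%s\n' "$candidates" | grep -w 'google.com')

# -x with an anchored pattern: the whole line has to match.
strict=$(printf '%s\n' "$candidates" | grep -x '^google\.com')

printf 'grep -w matched:\n%s\n' "$loose"
printf 'grep -x matched:\n%s\n' "$strict"
```

Here `-w` keeps all three lines, while `-x` keeps only the exact `google.com` entry.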
korhadris
a26377d229 Append ad list sources to latentWhitelist.txt to prevent them from being filtered.
Additional fixes for #35. This will prevent our own sources from being
filtered out by competing source lists.
2015-08-22 21:44:41 -07:00
korhadris
e464c04490 Ignore domains in ad lists that do not contain . characters.
This will skip entries such as `localhost`, `android`, `debian` and
empty lines as listed in #35.
2015-08-22 17:47:22 -07:00
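
One way to express this filter, as a sketch only: keep lines that contain a literal `.` and drop everything else. The sample entries are made up, and the real script may implement the check differently.

```shell
#!/usr/bin/env bash
# Sample entries, including dotless names and an empty line.
entries=$(printf '%s\n' localhost android '' ads.example.com debian tracker.example.net)

# -F treats "." as a fixed string, so only lines containing a dot survive.
kept=$(printf '%s\n' "$entries" | grep -F '.')

printf '%s\n' "$kept"
```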
korhadris
bb7db11214 Changing printouts when updating sources to tell what is going on when
manually running gravity.sh

This will print "Getting $domain list... " for each domain, followed
by either "Done" if data was received and validated, or "Skipping
list because it does not have any new entries" if no updates were
needed.
2015-08-22 17:33:30 -07:00
korhadris
1f29d01694 Remove leading and trailing whitespace and . characters and
duplicate `.` characters as each list is stored.

Should fix #32.
2015-08-22 17:05:19 -07:00
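
The cleanup described here could look roughly like the following. The `clean_domain` helper is a hypothetical name, and the three `sed` expressions are one possible way to do it, not necessarily what the commit uses.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads domains on stdin, writes cleaned domains to stdout.
clean_domain() {
    sed -e 's/^[[:space:].]*//' \
        -e 's/[[:space:].]*$//' \
        -e 's/\.\.*/./g'
}

# Leading whitespace and dot, a doubled dot, and a trailing dot and space.
cleaned=$(printf '  .ads..example.com. \n' | clean_domain)
printf '%s\n' "$cleaned"
```

The first expression strips leading whitespace and dots, the second strips trailing ones, and the third collapses any run of dots into a single dot.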
korhadris
d6d192cb0a Use url variable to store ${sources[$i]} value to improve readability.
I also wanted to replace the for loop iterating over indices with
something like:

`for url in "${sources[@]}"`

It made the use of `$i` in the save location more annoying though.
2015-08-22 16:22:07 -07:00
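
The trade-off mentioned in this commit can be sketched as follows. The URLs are placeholders and the `list.$i` naming is a made-up example of using the index in a save location.

```shell
#!/usr/bin/env bash
# Placeholder URLs for illustration only.
sources=('https://example.com/list-a.txt' 'https://example.com/list-b.txt')

# Index-based loop: $i stays available, e.g. to build a per-list file name.
by_index=()
for ((i = 0; i < ${#sources[@]}; i++)); do
    url=${sources[$i]}
    by_index+=("list.$i <- $url")
done

# Value-based loop: shorter, but the index must be tracked by hand if needed.
by_value=()
for url in "${sources[@]}"; do
    by_value+=("$url")
done

printf '%s\n' "${by_index[@]}"
```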
korhadris
0ec6eab683 Appending ".$justDomainsExtension" to $saveLocation variable.
Every use of $saveLocation was adding this and making lines
longer.
2015-08-22 16:04:54 -07:00
korhadris
159b29b80b Replace spaces with tabs to make indentation consistent within the file. 2015-08-22 15:56:32 -07:00
Fourdee
9d99a4ef36 Patch 3 - Don't use /etc/hosts
/etc/pihole/gravity.list now stores the block list. Ensures the
/etc/hosts file is left untouched.
2015-07-30 17:24:24 +01:00
jacobsalmela
563db80b6d resolves #25
Will not count blank lines if they happen to exist.
2015-07-17 20:49:03 -05:00
jacobsalmela
8f961c1aaa resolves #22
This lets dnsmasq re-read the hosts file without disturbing the daemon.
So any new entries can be used as soon as gravity.sh is finished
running.
2015-07-17 13:05:38 -05:00
rmceoin
f6ccb4b658 Merge remote-tracking branch 'upstream/master' 2015-07-13 09:55:53 -07:00
rmceoin
37e926ce84 Parses host only file formats now. Previously only handled hosts file format.
Specifically, it can now handle the following list:
'http://mirror1.malwaredomains.com/files/justdomains'
2015-07-13 09:28:45 -07:00
jacobsalmela
5c4bfb84b0 uses a variable for hostname instead of raspberrypi
Some people use a hostname other than raspberrypi, so their hostname
did not resolve to 127.0.0.1.  I replaced that hardcoded value with a
variable so that does not happen.

I also added a few comments and minor formatting adjustments.
2015-07-13 06:59:22 -05:00
rmceoin
552f980430 blacklist was being concatenated with wrong matter 2015-06-22 13:33:02 -07:00
rmceoin
66bb0e7bb3 Strip carriage returns on matter so that whitelists work correctly. Lines that had \r would not match. 2015-06-22 13:03:15 -07:00
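
A small sketch of why a trailing `\r` breaks matching, and how stripping it helps. The sample domain and the use of `tr -d '\r'` are assumptions for illustration; the commit may strip carriage returns another way.

```shell
#!/usr/bin/env bash
# A line with a Windows-style ending does not match a full-line pattern.
if printf 'example.com\r\n' | grep -qx 'example\.com'; then
    raw_match=yes
else
    raw_match=no
fi

# After deleting carriage returns, the same pattern matches.
if printf 'example.com\r\n' | tr -d '\r' | grep -qx 'example\.com'; then
    clean_match=yes
else
    clean_match=no
fi

printf 'raw: %s, cleaned: %s\n' "$raw_match" "$clean_match"
```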
rmceoin
e9324f8316 Use double brackets for the test. 2015-06-19 17:54:12 -07:00
rmceoin
67aba8c496 If exists, import a config file to allow for overriding script variables. 2015-06-19 13:31:51 -07:00
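
The override pattern can be sketched like this. The `listDir` variable, its values, and the config file location are hypothetical; a temp file stands in for the real config so the example runs anywhere.

```shell
#!/usr/bin/env bash
# Default value; the variable name is illustrative.
listDir=/etc/pihole

# Hypothetical config file, created here only for the demonstration.
configFile=$(mktemp)
echo 'listDir=/tmp/pihole-test' > "$configFile"

# Source the file only if it exists and is readable, so defaults
# stay in place when no config is provided.
if [[ -r "$configFile" ]]; then
    . "$configFile"
fi
rm -f "$configFile"

printf 'listDir is now %s\n' "$listDir"
```

Because the file is sourced after the defaults are set, any variable it assigns replaces the default.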
jacobsalmela
01ac3c1dd3 Ditching the use of the loopback
Pushing files so they are available when the new article gets posted.

If the Pi's loopback is set in the hosts file, clients using it as a
DNS server will try to connect to their own loopback, which does not
have a Web server.  So the real IP of the Pi is used.  It is
recommended to use a static IP since this will be acting as a server.

Made one small change from some hard coded values to a variable.
2015-06-13 22:01:12 -05:00
jacobsalmela
c563841714 changing the origin folder
Originally, I had this set to /run/shm (in RAM) but ran into errors
when the list reached 900,000 entries.
Then I moved it to /tmp.
Finally, I decided to just put the files in the pihole dir so they are
available after reboots.  This will help with only downloading the
lists when absolutely needed--respecting the bandwidth of the people
serving the lists.

It is also possible to add addn-hosts=/path/to/hosts.conf within the
dnsmasq.conf file if you don't want to use hosts.  For simplicity and
speed, I just use the regular hosts file.
2015-06-06 23:34:32 -05:00
jacobsalmela
457b70f5c5 add IPv6 support in the hosts file
Still need to get lighttpd to use IPv6.  I am doing this because some
ads can get through using IPv6 if the IPv4 version is blocked.  Also,
it seems to work fine as far as performance even though it doubles the
file size...

Also added a few comments for better documentation.
2015-06-04 08:21:44 -05:00
jacobsalmela
61c99ff145 forgot to change origin dir 2015-05-19 13:32:37 -05:00
jacobsalmela
56c776af22 hosts format script 2015-05-19 13:31:37 -05:00
jacobsalmela
18d6f4b747 initial commit 2014-09-26 09:28:44 -05:00
Jacob Salmela
8569c419c1 added file path 2014-06-10 20:22:26 -05:00
Jacob Salmela
2af19a0194 added cmd to restart dns 2014-06-08 10:14:54 -05:00
Jacob Salmela
2131149fda Create gravity.sh 2014-06-08 10:03:56 -05:00