The Kill Corporations Enterprises

potentially hazardous object

apophis

does anyone have a good way to do the following as a CLI command?

1. take two plaintext files, A and B

2. in A, look for a beginning and ending tag (like "<head>" and "</head>" or something but i'm not necessarily working only with web pages)

3. delete whatever is (if anything) between those two tags and insert all of B in that place instead

GNUkko Sauvage (eris-ng)

Reply to @apophis

Hmmmm, I feel like this should be possible with ed, and thus with sed, but idk how painful it would be.

melo

moonsea@omg.sayitditto.net

Reply to @apophis

@apophis i wouldn't know but to me it sounds like this could be done with awk

potentially hazardous object

apophis

Reply to @eris@p.enes.lv

@eris i was trying to look up how to do this in sed a couple weeks ago but got hopelessly stuck at the "look for" part

the fact that i don't know how to use regexes (and they seem to be a different format depending on what exact software you're using so i can't trust anything i find to work???) might be the sticking point

potentially hazardous object

apophis

Reply to @apophis

fwiw i can do this in ZScript...

GNUkko Sauvage (eris-ng)

Reply to @eris@p.enes.lv

Ugh them being line oriented makes stuff much less fun

CC: @apophis@kill-corporations.enterprises

💙🩷💜Ⓑⓡⓔⓣⓣ🐡🍉🐧

brettm@swarm.coiloptic.org

Reply to @apophis

@apophis@kill-corporations.enterprises

Below is a script that I use to take 'raw' website content and add a header and footer.

Under that are awk and sed commands for lines 'before' or 'after' a specific string.

You can play around with these to get what u want 🙂

-----
#!/bin/ksh

# Wrap every "raw" file in the shared header and footer, writing an html copy.
processor() {
    for i in *raw*; do
        [ -f "$i" ] || continue
        j=$(printf '%s\n' "$i" | sed 's/raw/html/')
        echo "$j"
        cat /home/website/header "$i" /home/website/footer > "$j"
    done
}

cd /home/website/ && processor

----Only print the lines after the first line containing a given string

(e.g. the string is 'Forecast'):

awk '/Forecast/{p++;if(p==1){next}}p' xxx

-----Only print the lines up to (not including) the first line containing "Forecast"

sed -n '/Forecast/q;p' xxx

----
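
For comparison, here is a rough Python sketch of the same two filters. The function names lines_before and lines_after are made up for illustration, and unlike the awk and sed versions it matches a plain substring rather than a regex.

```python
def lines_before(lines, needle):
    """Return the lines that come before the first line containing needle."""
    out = []
    for line in lines:
        if needle in line:
            break
        out.append(line)
    return out


def lines_after(lines, needle):
    """Return the lines that come after the first line containing needle."""
    it = iter(lines)
    for line in it:
        if needle in line:
            break
    # Whatever the iterator has left is everything after the first match
    # (nothing at all if needle never appeared).
    return list(it)


sample = ["intro", "Forecast: rain", "details", "Forecast again", "outro"]
before = lines_before(sample, "Forecast")
after = lines_after(sample, "Forecast")
```

Joining the two results around new content gives a line-oriented replacement, with the same caveat about anything else sitting on the matching lines.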

GNUkko Sauvage (eris-ng)

Reply to @eris@p.enes.lv

I think sed (or ed) is not the right approach here, just implement the thing in your favorite scripting language, but if you want to see the disgusting thing I wrote:

─ cat fileA      
hello
world
how are you doing on this fine day
I am doing quite START fine 
what about you?
oh yeah whatever
blabla END
honestly i dont care i am a meanie beanie muhaahahaha
END
damn, another end huh
─ cat fileB
this is the 
new content
in between the tags :)
─ ed -s fileA
# This is all typed in; you can save it in a file and run ed -s fileA < edscript
# Add newlines after STARTs and before ENDs
g/START/s/START/START\
/
g/END/s/END/\
END/
# Delete everything between the first START and the next END
1;/START/+1;/END/-1d
# Read in the other file in between
-1r fileB
# Get rid of newlines after STARTs and before ENDs
g/START/.;+1j
g/END/-1;+1j
w
q
─ cat fileA
hello
world
how are you doing on this fine day
I am doing quite STARTthis is the 
new content
in between the tags :)END
honestly i dont care i am a meanie beanie muhaahahaha
END
damn, another end huh

Anyways if you don't want to write your own script and would rather have a shell script with sed and such, you should probably do something similar to what I did:
1. add extra newlines similarly to what I did, but use some sed s/START/START\n/ (that \n works in GNU sed; other seds may need a backslash followed by a literal newline instead)
2. use awk to filter out the unwanted content or perhaps split the file in pre-head and post-head parts, I'm not sure sed is able to store enough context to do it
3. cat pre-head new-head post-head to join them together
4. optionally if it's important, run a sed to remove extra newlines added in step #1 (for HTML head tags it doesn't matter)
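
The pre-head / new-head / post-head split from steps 2 and 3 can also be sketched in Python, which avoids adding and removing newlines altogether. This is only a minimal sketch: the name replace_between and the sample strings are invented for illustration, and it assumes both files fit in memory and that only the first START ... END pair should be touched.

```python
def replace_between(text, start_tag, end_tag, new_content):
    """Replace whatever sits between the first start_tag and the next end_tag."""
    # Split into: everything before the tag, the tag itself, everything after.
    pre, start, rest = text.partition(start_tag)
    if not start:
        raise ValueError("start tag not found")
    # Same split on the remainder; the old content in the middle is discarded.
    _old, end, post = rest.partition(end_tag)
    if not end:
        raise ValueError("end tag not found after start tag")
    # "cat pre-head new-head post-head", keeping both tags in place.
    return pre + start + new_content + end + post


file_a = "I am doing quite START fine\nwhat about you?\nblabla END\nEND\n"
file_b = "this is the\nnew content\n"
result = replace_between(file_a, "START", "END", file_b)
```

Like the ed version, this leaves any later END alone, because only the first match after START is used.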

CC: @apophis@kill-corporations.enterprises

GNUkko Sauvage (eris-ng)

Reply to @apophis

(If it's fast enough for you and easy to run from CLI and you'll have access to it on all your machines, just do this even if it might seem absurd.)

GNUkko Sauvage (eris-ng)

Reply to @brettm@swarm.coiloptic.org

Edited 5 months ago

This is a way to do it, but care should be taken when cutting strings out of generated HTML: there might be stuff on the same line as <head> or </head> which you might want to leave in or cut out, so you should also do something like s/^.*\(Forecast\)/\1/ (and the analogue in awk), which drops everything before the tag on that line but keeps the tag itself.

BTW to me your post feels like a MIME formatted email opened in a mail client that doesn't support it xdd (because of all the dashes)

CC: @apophis@kill-corporations.enterprises

potentially hazardous object

apophis

Reply to @eris@p.enes.lv

@eris problem is i can't write any output to a file

i did find the find() and seek() functions in python though which work similarly
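
A find()-based version might look roughly like the sketch below. Note that find() is a string method while seek() belongs to file objects, so the sketch only needs find() plus slicing. The names splice, main and the script name splice.py are hypothetical, and it assumes the files are UTF-8 text small enough to read whole; the result goes to stdout, so it can be redirected wherever it is needed.

```python
import sys
from pathlib import Path


def splice(a_text, b_text, start_tag, end_tag):
    """Return a_text with the span between the tags replaced by b_text."""
    start = a_text.find(start_tag)
    if start == -1:
        raise ValueError("start tag not found")
    body_start = start + len(start_tag)
    # Only look for the end tag after the start tag.
    end = a_text.find(end_tag, body_start)
    if end == -1:
        raise ValueError("end tag not found after start tag")
    return a_text[:body_start] + b_text + a_text[end:]


def main(argv):
    # Hypothetical usage: python splice.py A B '<head>' '</head>' > out
    path_a, path_b, start_tag, end_tag = argv
    a_text = Path(path_a).read_text(encoding="utf-8")
    b_text = Path(path_b).read_text(encoding="utf-8")
    sys.stdout.write(splice(a_text, b_text, start_tag, end_tag))


if __name__ == "__main__" and len(sys.argv) == 5:
    main(sys.argv[1:])
```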

GNUkko Sauvage (eris-ng)

Reply to @apophis

Regarding regex formats, sed uses the ancient POSIX Basic RE; you should always add -E to it on the command line to get a more modern-looking regex (POSIX ERE / Extended Regular Expressions). Unless you use fancy stuff like backreferences, lookaheads, or such, basic regex knowledge from Perl/Python/Java/Javascript/anything-from-this-century should be transferable to it.
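
As a point of comparison, the same tag replacement can be written with Python's re module. This is a sketch under assumptions: the name replace_between_re is made up, the tags are treated as literal text via re.escape, and the non-greedy `.*?` it relies on is a Perl/Python-style feature that POSIX ERE does not define, so this pattern would not carry over to sed -E unchanged.

```python
import re


def replace_between_re(text, start_tag, end_tag, new_content):
    """Replace the text between the first start_tag and the next end_tag."""
    pattern = re.compile(
        # Escape the tags so characters like "." or "<" are matched literally.
        "(" + re.escape(start_tag) + ").*?(" + re.escape(end_tag) + ")",
        re.DOTALL,  # let "." match newlines so the span can cover several lines
    )
    # A function replacement keeps backslashes in new_content from being
    # interpreted as group references.
    return pattern.sub(
        lambda m: m.group(1) + new_content + m.group(2), text, count=1
    )


page = "x<head>\nold\n</head>y</head>"
updated = replace_between_re(page, "<head>", "</head>", "new")
```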

GNUkko Sauvage (eris-ng)

Reply to @apophis

Do you need help with implementing the thing in Python in a more normal way than what I presented or would you prefer to do it yourself?

potentially hazardous object

apophis

Reply to @eris@p.enes.lv

@eris i'll see if i can give it a try myself for now, but thanks

💙🩷💜Ⓑⓡⓔⓣⓣ🐡🍉🐧

brettm@swarm.coiloptic.org

Reply to @eris@p.enes.lv

@eris@p.enes.lv @apophis@kill-corporations.enterprises
oh yeah I did not consider stuff on the same line!

I use dashes a lot to divide up my notes, they came from my fingers not any mail formatting 🙂

GNUkko Sauvage (eris-ng)

Reply to @brettm@swarm.coiloptic.org

Oh yeah yeah the MIME format (and multipart/form-data, just remembered) is different, that's just what it reminded me of. Was meant in a positive sense like "silly similarity I saw".

CC: @apophis@kill-corporations.enterprises

potentially hazardous object

apophis

Reply to @apophis

problem solved, call me hercules because im baby wrangling python
