Forum Archive

'cloud' module

[deleted]

The project is hosted here on GitHub

@ccc said:

GitHub can be confusing... There are a lot of utilities in Pythonista-Tools to help but it can still be complicated.

I have the idea to make a 'cloud' module and a first function 'Import' in it. The goal would be to make the entry curve to using code hosted on GitHub much easier.

For example... instead of using import pythonista.editor you would use cloud.Import('pythonista.editor').

An online 'plist' will contain the mappings of module/package names to GitHub URL. The function will look up the correct URL, check for first time use or new version available... and if so, download the Zip, and unpack it.

Finally it would import as normal.

[deleted]
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>pythonista</key>
    <string>https://github.com/The-Penultimate-Defenestrator/Pythonista-Tweaks</string>
    <key>pythonista.app</key>
    <string>https://github.com/The-Penultimate-Defenestrator/Pythonista-Tweaks</string>
    <key>pythonista.editor</key>
    <string>https://github.com/The-Penultimate-Defenestrator/Pythonista-Tweaks</string>
    <key>pythonista.console</key>
    <string>https://github.com/The-Penultimate-Defenestrator/Pythonista-Tweaks</string>
    <key>Gestures</key>
    <string>https://github.com/mikaelho/pythonista-gestures</string>
</dict>
</plist>
[deleted]

The above 'plist' is for example and development purposes.
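
For illustration, the lookup step could be sketched with the standard plistlib module as below. This is only a sketch under assumptions: archive_url is a hypothetical helper name, the plist is embedded as a shortened copy instead of being fetched online, it uses the Python 3 call plistlib.loads, and the /archive/master.zip suffix assumes the repo's default branch is master.

```python
import plistlib

# Shortened copy of the example plist from this thread (embedded, not downloaded).
PLIST = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>pythonista.editor</key>
    <string>https://github.com/The-Penultimate-Defenestrator/Pythonista-Tweaks</string>
    <key>Gestures</key>
    <string>https://github.com/mikaelho/pythonista-gestures</string>
</dict>
</plist>
"""


def archive_url(target, plist_bytes=PLIST):
    """Return the Zip URL for `target` (hypothetical helper, assumes a master branch)."""
    mapping = plistlib.loads(plist_bytes)  # the plist <dict> becomes a Python dict
    return mapping[target] + '/archive/master.zip'  # unknown names raise KeyError


print(archive_url('Gestures'))
```

A name that is not in the list raises KeyError, so a misspelled module name never turns into a download.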

Webmaster4o

I talked recently about making an all-in-one service for this. It'd provide an interface for uploading scripts, and an interface for browsing and downloading scripts. It would revolve around a GitHub repo, but you'd never deal with git, you'd only look at the pretty interface that I designed. Traffic would be routed through my website so that I didn't have to expose the password to the GitHub repo in the code of the script. The GitHub repo would just be used for storage because I have only limited storage on my webserver. Your idea could also be neat. If I ever implement my idea, I might use a similar interface for importing the scripts.

[deleted]

@Webmaster4o I like your idea a lot. :)

wnMark

Maybe offtopic!

I had two dreams this night.
1. I wish Dua Lipa would sing at my birthday in November this year!
2. I want direct, real-time communication between a Pythonista app on an iPhone and another Pythonista app on an iPad.

What do you think, which dream can become real?

[deleted]

I've got as far as downloading and extracting the Zip file...

P.S. Extracting all is temporary to see what's in the archive... I will aim to extract just the folder direct to site-packages.
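
Extracting just the package folder could look something like the following. It is a sketch, not the code from this post: extract_package is a hypothetical name, it is written for Python 3, and it assumes the usual GitHub archive layout where everything sits inside one wrapper folder such as Repo-master/.

```python
import io
import os
import tempfile
import zipfile


def extract_package(zip_bytes, package, dest_dir):
    """Copy only <wrapper>/<package>/... from a GitHub-style Zip into dest_dir."""
    extracted = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for member in archive.namelist():
            parts = member.split('/')
            # Skip directory entries, files outside the package, and unsafe paths.
            if member.endswith('/') or len(parts) < 3 or parts[1] != package:
                continue
            if '..' in parts:
                continue
            target = os.path.join(dest_dir, *parts[1:])  # drop the wrapper folder
            os.makedirs(os.path.dirname(target), exist_ok=True)
            with archive.open(member) as src, open(target, 'wb') as dst:
                dst.write(src.read())
            extracted.append('/'.join(parts[1:]))
    return extracted
```

Here dest_dir would be the site-packages folder; a temporary directory is a safer place to try it out first.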

[deleted]

Updated code that's tested with both package and module and updates site-packages (no version checking or actual imports yet)

[deleted]

True 'imports' are working now (using inspect to get the namespace of the module doing the cloud.Import).

To do: version checking, better working directory

import cloud

cloud.Import('pythonista.editor')
cloud.Import('Gestures')

pythonista.editor.WebTab().present()
g = Gestures.Gestures()
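
The inspect trick mentioned above can be sketched in isolation like this. import_into_caller is a hypothetical name, and the sketch only imports modules that are already installed (no download step); it binds the top-level package in the caller's globals, the same way a plain import statement would.

```python
import importlib
import inspect


def import_into_caller(name):
    """Import `name` and bind its top-level package in the caller's globals."""
    module = importlib.import_module(name)
    top_level = name.split('.')[0]
    # f_back is the frame of whoever called this function.
    caller_globals = inspect.currentframe().f_back.f_globals
    caller_globals[top_level] = importlib.import_module(top_level)
    return module


import_into_caller('xml.dom')
print(xml.dom.__name__)  # `xml` is now a global here, as after `import xml.dom`
```
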
Webmaster4o

Wow, you actually have it going through the XML that you posted here 😂

ccc
DOCS_DIR = os.path.expanduser('~/Documents/')
SITE_DIR = os.path.join(DOCS_DIR, 'site-packages/')

Also, why a plist and XML when you could go for a more human friendly format like yaml or json?

[deleted]

@ccc Actually, I thought you'd comment on the last if... else, we can maybe squeeze 3 source lines out of that. :)

You're a fan of fewest source lines I think, but your DIRs add 2 lines, or am I mistaken?

Plist and XML ... well use what you know for one thing, but just 3 source lines to get the URL, isn't that pretty neat?

I don't know yaml or json... by comparison, how many source lines to do it in them? You'd have to give an example to convince me. How much more human friendly are they? Does it matter much when the user doesn't see it?

dgelessus

JSON is structurally very similar to Python dicts and lists containing basic data types. This means that converting JSON to Python objects and back is very easy. For example you might write the plist from the first post as JSON like this:

{
    "pythonista": "https://github.com/The-Penultimate-Defenestrator/Pythonista-Tweaks",
    "pythonista.app": "https://github.com/The-Penultimate-Defenestrator/Pythonista-Tweaks",
    "pythonista.editor": "https://github.com/The-Penultimate-Defenestrator/Pythonista-Tweaks",
    "pythonista.console": "https://github.com/The-Penultimate-Defenestrator/Pythonista-Tweaks",
    "Gestures": "https://github.com/mikaelho/pythonista-gestures"
}

Then you can load the file locally like this:

import json

with open("modules.json", "r") as f:
    modules = json.load(f)

Then modules will be a dict that looks like the JSON file. If you want to read it from a web server:

import requests

modules = requests.get("https://example.com/modules.json").json()

This is even shorter, because requests has built-in support for receiving and sending JSON.

[deleted]

@dgelessus It may be worth considering when the source is on a regular web server, but I think not just yet while it's hosted in the forum. (2 1/2 of the 3 lines are for retrieving the XML or JSON from the forum.)

dgelessus

If you put the JSON file in a GitHub repo, you can access it "raw" as if it were on a web server. For example, this is the raw link to the README file for stash: https://github.com/ywangd/stash/raw/master/README.md

Obviously you wouldn't want to download that file automatically, but you get the point. It's really not hard to put a file on GitHub and it saves the annoying HTML parsing.
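
As a sketch of that idea, a raw link can be built from its parts and the JSON read with the standard library alone. raw_url and load_modules are hypothetical helper names, and no modules.json file is claimed to exist in any repo; the URL pattern is simply the one from the stash link above.

```python
import json
from urllib.request import urlopen


def raw_url(user, repo, path, branch='master'):
    """Build a GitHub 'raw' URL with the same shape as the stash README link."""
    return 'https://github.com/{}/{}/raw/{}/{}'.format(user, repo, branch, path)


def load_modules(url):
    """Download a JSON file and return it as a dict (needs network access)."""
    with urlopen(url) as response:
        return json.loads(response.read().decode('utf-8'))


print(raw_url('ywangd', 'stash', 'README.md'))
```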

[deleted]

@dgelessus I have more in mind @Webmaster4o hosting it on his web server if he's willing, as we discussed above about his overall vision for these things.

Tizzy

+1 for json!!!!!!! I'm not comfortable with working with xml in python.

so let me understand this, If I misunderstood please do correct me.

the idea is that you want to import things and have them automatically download if they are not already present in site-packages?

Would there be a central location that would list "approved and working" modules where the thing would check? Or would you risk downloading some malicious code because you mis-spelled a module name?

I do love the overall sentiment of centralizing and consolidating something as basic as installing packages.

[deleted]

@Tizzy Only the modules in the list (whether XML or JSON, JSON will be marginally better) will work... nothing misspelled or otherwise will do anything.

As a user of cloud.Import you don't need to know or care about the XML/JSON and you don't see it.

ccc

You're a fan of fewest source lines I think, but your DIRs add 2 lines, or am I mistaken?

I do not subscribe to the "fewest lines automatically wins" philosophy but I do love to be concise in my code thus my love for "ten lines or less".

  • Code is more often read than written so... "Readability counts"
  • Don't Repeat Yourself (DRY)
  • The more concise your code is, the easier it is to understand and to debug but it must be readable
  • Having clean lines is just as important as having few lines.
  • Breaking your work into functions and methods makes your code more understandable and reusable.

So "wasting a few lines" to calculate DOCS_DIR and SITE_DIR up front will make the code easier to read, easier to debug, more concise and more efficient (DRY). This is all goodness in alignment with the Zen of Python.

Plist and XML ... well use what you know for one thing, but just 3 source lines to get the URL, isn't that pretty neat?

My favorite friends are the ones that push me to learn new things ;-) I think @dgelessus is trying to do that above. His idea about putting your data file on GitHub is a good one.

I don't know yaml or json... by comparison, how many source lines to do it in them? You'd have to give an example to convince me.

@dgelessus has started this above, but if you put your code in a GitHub repo, we can submit pull requests for you to consider.

How much more human friendly are they?

A ton more human friendly (and more efficient for computers to parse too). When I worked at Sun Microsystems, I met Jon Bosak, the "Father of XML", a few times and I have the utmost respect for his work. XML and its descendants have had a tremendous positive effect on computing over the years, but today for these kinds of projects we have better tools than XML.

Does it matter much when the user doesn't see it?

Oh yes it does!! If your goal is to build systems that last then take the time to make your work easy for yourself, your maintainer and your system administrator, as well as for your end user.

dgelessus

Just want to say that XML is not terrible or absolutely useless in every case. If you use XML as a markup language (remember, it's called eXtensible Markup Language) then it is definitely a better choice than JSON. For example, say you have this fictional HTML ripoff:

<document>
<p>This is the first paragraph.</p>
<p>This paragraph contains <b>bold</b> text.</p>
</document>

How would you translate that to JSON? Even if you assume that the top-level element is always document and that all subelements are p, you get something like this:

[
    ["This is the first paragraph."],
    ["This paragraph contains ", {"style": "bold", "text": "bold"}, " text."]
]

I would prefer the XML version. Especially if your document format becomes larger and you might have different "paragraph" types, then each paragraph would need to be an object too.

Point is, use the formats for what they are good at. XML for text markup and document-like formats, JSON for structured data made of basic data types.
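
To see why markup fits XML, here is how the standard ElementTree module represents the second paragraph of the example document. This is just an illustrative sketch; the variable names are made up for the example.

```python
import xml.etree.ElementTree as ET

DOC = """<document>
<p>This is the first paragraph.</p>
<p>This paragraph contains <b>bold</b> text.</p>
</document>"""

root = ET.fromstring(DOC)
second = root.findall('p')[1]
bold = second.find('b')

# Mixed content: text before the child is in .text of the parent,
# text after the child is in .tail of the child.
pieces = [second.text, bold.text, bold.tail]
plain = ''.join(second.itertext())  # all text with the markup stripped

print(pieces)
print(plain)
```

The text and the inline element stay in document order without any extra wrapper objects, which is the part that gets clumsy in JSON.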

Webmaster4o

@dgelessus For this purpose I think JSON works better though. JSON plays much more nicely with Python, it can be directly converted to Python dicts and lists, where XML requires complex ElementTrees. XML is also vulnerable to many types of attacks that JSON (to my knowledge) is not. See a list of the Python xml module's vulnerabilities, which includes malicious attacks like billion laughs.

omz

JSON plays much more nicely with Python, it can be directly converted to Python dicts and lists, where XML requires complex ElementTrees.

Well, there is the handy xmltodict module...

dgelessus

@Webmaster4o Agreed, for this case JSON is better IMHO. And no, because JSON is a very simple format, it doesn't support external references or definitions that could be used for easy DOSing.

Webmaster4o

@omz Didn't know about that one. There's still the vulnerabilities, though.

omz

@Webmaster4o Yeah, I wasn't really arguing for XML, but in some cases you have to use it because there's an existing API, and this can make life easier... As for vulnerabilities, that doesn't really matter in a package manager scenario imho. If you can't trust the source, you're screwed anyhow because it would be much easier (and less suspicious-looking) to inject malicious packages than to exploit parser vulnerabilities...

[deleted]

@omz said:

Well, there is the handy xmltodict module...

Thanks for the tip.

[deleted]

@guerito said:

@dgelessus [JSON] may be worth considering...

Was agreed quickly... but this is hardly a priority or such a big deal.

Yes, it was great to be pushed to learn about namespaces and imports, that was totally new to me, and now JSON too :)

The next priority for me is the version checking.

I'm a fan of @Webmaster4o because his comments and contributions always seem to have the vision and goal of a project in mind rather than trivialities.

Phuket2

@guerito, I agree about @Webmaster4o , I have a feeling he will be a big name in tech some day. To me it appears he has a very balanced rationale whilst being very smart and astute. Way beyond his years.

ccc

I had never used xmltodict but it seems highly effective...

import xmltodict

data_as_xml = '''<?xml version="1.0" encoding="UTF-8"?>
    [ ... the XML from above ... ]
</plist>'''

the_dict = xmltodict.parse(data_as_xml)['plist']['dict']
the_dict = dict(zip(the_dict['key'], the_dict['string']))
print(the_dict)
[deleted]

@dgelessus Could you modify your forum post with the second XML block as I don't know if you realised but this breaks the code?

Meanwhile the following fix is needed:

if s[:5] == '<?xml':
    urlZ = plistlib.readPlistFromString(s)[sTarget] + '/archive/master.zip'
    break
ccc

@guerito You can edit your own post by tapping on the three dots at the upper right of the post and then tapping edit.

[deleted]

@ccc Of course, but this is not my post, it's @dgelessus 's

[deleted]

@Webmaster4o I have version checking implemented now... So I'm pretty much done with making a first version and this could start to become a community project.

Would you be willing to host the cloud module on your GitHub like you do the pythonista module?

You could add your other ideas into the module if you liked or copy them over to one of yours.

dgelessus

@guerito Why not create a GitHub account for yourself? It's completely free, and the GitHub web interface is very easy to use, even if you don't know anything about Git in the command line.

I'll edit the block of XML so it (hopefully) doesn't get recognized by your script anymore. But please, don't try to use a forum thread for any serious data hosting, as you can see it is quite annoying for everyone involved. Use GitHub, Gist, Pastebin or some other service that is designed for the job.

[deleted]

@dgelessus Well done.

Webmaster4o

@guerito Sure. I can tonight. Send me the code ;)

[deleted]

@Webmaster4o Great! Thanks as ever. Will do.

Webmaster4o

Here it is. I'll put some work into it as far as implementing my own ideas soon. I've got to finish writing the help menu for this first.

[deleted]

Updated the first topic post to point to the project on GitHub

Webmaster4o

@guerito Does it really check for updates to the existing modules every time a module is imported? I think there should be a separate cloud.update() function that updates the already-downloaded modules.

I haven't looked at this extensively yet, but the way I understand it, this should be used not just for downloading modules for the first time but also for importing them later.

[deleted]

@Webmaster4o Good point. My thinking was to have a module default of say 'daily' that could be changed to 'hourly' or something.
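
A check interval like that might be sketched as follows. The names INTERVALS and update_due are hypothetical, and where the timestamp of the last check gets stored (for example alongside the version data) is left open.

```python
import time

INTERVALS = {'hourly': 60 * 60, 'daily': 24 * 60 * 60}  # seconds


def update_due(last_checked, interval='daily', now=None):
    """True if `interval` has passed since `last_checked`.

    `last_checked` is a Unix timestamp, or None if no check has happened yet.
    """
    if last_checked is None:
        return True
    if now is None:
        now = time.time()
    return now - last_checked >= INTERVALS[interval]
```

With this, the update check would only contact the server when update_due(...) is true and would otherwise go straight to the import.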

While the list (in XML or JSON) is hosted on the forum that's slow to get I guess, because the forum returns all the topic posts even if a single post was referenced in the URL. If you move the list to your web server it will be more efficient I think.

JonB

i like this idea... but why not allow generic github repo names, and forgo the plist altogether? That way, no maintenance is required, plus it allows new modules to use this without having to first go through a pull request to whoever is maintaining the list...

cloud.Import('mikaelho.pythonista-gestures')

The xml/plist, whatever, does not seem to add any value except for saving a few characters.

For extra utility, you could allow a specific branch/version callout, and/or a folder or file to import. That would make it a lot easier to load other external GitHub repos, which are often in PyPI format, with a top-level folder holding license/readme/setup.py and the actual code folder elsewhere.

[deleted]

@JonB You will need, I think, a way to figure out the module/package name from the repo... for example with pythonista tweaks... to know it's the pythonista folder and not examples... both for knowing what to copy to site-packages and to know what to import at the end.

P.S. you could look for the __init__.py to find the package folder... but still how to know the sub-module required to import e.g. editor, console, app etc

dgelessus

Normally there is only one folder in a repo that is a valid package (i. e. has an __init__.py in it), so once you've found that it should be clear what the package's name is and where the submodules are. Some repos also have the __init__.py and all submodules at the top level, in that case there's no way to tell what the "real" package name is, but the repo name will probably work.
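
That search could be sketched over the member names of a downloaded Zip, as returned by zipfile.ZipFile.namelist(). find_packages and guess_package are hypothetical names, and the file names used when trying it out are made up.

```python
import posixpath


def find_packages(names):
    """Return the sorted folders that directly contain an __init__.py."""
    return sorted({posixpath.dirname(n) for n in names
                   if posixpath.basename(n) == '__init__.py'})


def guess_package(names):
    """Return the single top-most package folder, or None if it is ambiguous."""
    packages = find_packages(names)
    # Ignore sub-packages nested inside another package.
    roots = [p for p in packages
             if not any(p.startswith(q + '/') for q in packages if q != p)]
    return roots[0] if len(roots) == 1 else None
```

If the __init__.py sits at the top level of the archive, the result is the wrapper folder itself, which corresponds to falling back on the repo name.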

[deleted]

@dgelessus

@guerito said:

@JonB ... but still how to know the sub-module required to import e.g. editor, console, app etc

JonB

I would think as a separate argument. i.e

Import(gituser, gitrepo, submodule)

or possibly
Import(gituser,gitrepo, submodule=None, package=None)

dgelessus

Are GitHub user/repo names allowed to contain dots? If not, then user, repo, module = instring.split(".", maxsplit=2)

Webmaster4o

@dgelessus Yes. The name of the repo has a . in it. It's called "pythonista.cloud" 😂
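
Since dots can appear in repo names, one option is to use separators that cannot be confused with them. The sketch below assumes a made-up 'user/repo:module' spelling; parse_spec and someuser are hypothetical, and nothing like this was agreed in the thread.

```python
def parse_spec(spec):
    """Split 'user/repo' or 'user/repo:module' into (user, repo, module)."""
    location, _, module = spec.partition(':')
    user, sep, repo = location.partition('/')
    if not (user and sep and repo):
        raise ValueError('expected user/repo[:module], got %r' % spec)
    return user, repo, module or None


# Splitting on dots cuts a dotted repo name in the wrong place:
broken = 'someuser.pythonista.cloud.cloud'.split('.', 2)

# The slash/colon form keeps the repo name whole:
parsed = parse_spec('someuser/pythonista.cloud:cloud')
```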

Webmaster4o

I'm beginning work on this now. I've created the repository pythonista.cloud-database with my main GitHub account, and I've created a new account, pythonista-cloud-access which is added as a collaborator. This account will be used by the web service to access the repo. This has two advantages:

  • It only has access to one repository. An attacker who found the login could only modify the database, not any of my other repos
  • Collaborators cannot delete repos they do not own. An attacker could delete all the files from the database, but they would still be in the commit history. No irreversible changes could be made from this account.
[deleted]

@Webmaster4o Way to go! :)

dgelessus

@Webmaster4o I'm no Git expert, but I think you can git push --force to do destructive changes to a remote Git repository that you have write access to. The only reason I had to use it was to "squash" commits in a PR (i. e. combine multiple commits into one, so they look nicer in the main repo's history) but I'm sure you can do worse things. And even if that doesn't work, you can make a new branch off the initial commit, make it the default, then delete the old master branch and poof goes your version history. So do please take care of the account and keep extra backups. :)

Webmaster4o

@dgelessus Yeah. I don't think there's any way to get the page source for a PHP page. So it should be secure. I will, of course, keep backups. This is just in case, I think it's pretty secure. The password is also one of Safari's "suggested passwords", which are alphanumeric with special characters, and completely random. It's not what one might call "guessable"

[deleted]

@Webmaster4o Now that you've started to work on your broader vision for this - I've updated the Topic to be 'cloud' module

Phuket2

Thanks guys, that cloud talk, reminded me to buy Phuket2.cloud

Webmaster4o

Has anyone used and liked any git libraries for Python, PHP, or Node.JS? (I've never actually used Node.js but I know vanilla js)

So far I've been let down by:
- Python gittle (got through like 5 errors and I gave up when I got to a sixth)
- Python dulwich.porcelain, (but I haven't tried looking deeper into the module).
- Just running git terminal commands with shell_exec(PHP) or os.system (Python). (I can't figure out how to pass authentication unless I'm actually in an interactive prompt)

If anyone can recommend a good native library for git in any of these three languages, it'd be much appreciated. Thanks 😃!

dgelessus

If you want to run shell commands from Python, never use os.system. There is the subprocess module, which is safer, easier to use and has many convenience features (such as standard stream redirection).
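
A minimal sketch of the subprocess approach, for Python 3.5 or later: run is a hypothetical wrapper name, and the child command here is just the Python interpreter so the example runs anywhere. For git you would pass something like ['git', 'status'] instead.

```python
import subprocess
import sys


def run(args):
    """Run a command given as a list of arguments and return its stdout as text."""
    result = subprocess.run(
        args,                      # a list, so no shell quoting is involved
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,   # decode the output to str
        check=True,                # raise CalledProcessError on a non-zero exit
    )
    return result.stdout


output = run([sys.executable, '-c', 'print("hello from a child process")'])
print(output.strip())
```

Passing the arguments as a list avoids going through a shell, and check=True turns a failed command into an exception instead of a silently ignored exit code.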

Webmaster4o

I'm now leaning away from using GitHub at all, there's really no point. I don't need collaboration, and I'm using my server anyway. I think I'll host the code directly on the server, even though I've only got 20GB. I'll probably limit individual files to 50KB each, and entire multi-directory projects to 3MB zipped. If you try to upload a larger file, the app will prompt you to upload it to GitHub, and then it will just store a remote URL.

brumm

CreateRepo don't know if this helps? What are you looking for?

Webmaster4o

@brumm Nah, I'm good now. Trying to figure out how to make an API for uploading files. I think I'm gonna have b64encode-ed data passed in the URL query string.

Webmaster4o

Initial tests have been bumpy. Just testing POST requests for file uploads with PHP, I lost some data. The before and after with an image (converted to .txt afterwards):

All the strange characters got lost and converted to unicode characters. So that's something to fix. Uploading ASCII files works well.

Webmaster4o

I'm also trying to rewrite it to allow for things like from cloud import gestures. The way I'm doing this is as follows:

# coding: utf-8

'''
cloud.py 

Vision: 

- cloud.Import: to make the entry curve to using code hosted on GitHub much easier

Credits: 

- cloud.Import: idea and first version by @guerito, future versions on @webmaster4o's GitHub

'''

class PseudoModule: 
    def Import(self,sTarget):
        #Load modules into local namespace
        for code in self.bs4.BeautifulSoup(self.urllib2.urlopen('http://forum.omz-software.com/topic/2775/cloud-import').read()).find_all('code'):
            s = code.getText()
            if s[:5] == '<?xml': 
                urlZ = self.plistlib.readPlistFromString(s)[sTarget]
                break
        d = dict()
        if self.os.path.isfile(self.os.path.expanduser('~/Documents/site-packages/' + 'cloud.pkl')):
            with open(self.os.path.expanduser('~/Documents/site-packages/' + 'cloud.pkl'), 'r') as f:
                d = self.pickle.Unpickler(f).load()
        s = self.html2text.html2text(self.urllib2.urlopen(urlZ).read())
        i = s.find('commits')
        iNow = int(s[i - 4:i - 1])
        try:
            iOld = d[sTarget.split('.')[0]]
        except:
            iOld = -1
        d[sTarget.split('.')[0]] = iNow
        with open(self.os.path.expanduser('~/Documents/site-packages/' + 'cloud.pkl'), 'w') as f:
            self.pickle.Pickler(f).dump(d)
        if iNow > iOld:
            self.console.hud_alert('updating ' + sTarget + ' ...')
            urlZ += '/archive/master.zip'
            sZ = self.os.path.expanduser('~/Documents/'+  urlZ.split('/')[-1])
            self.shutil.copyfileobj(self.urllib2.urlopen(urlZ), open(sZ, 'wb'), length=512*1024)
            with open(sZ, 'rb') as f:
                for member in self.zipfile.ZipFile(f).namelist():
                    l = member.split('/')
                    if len(l) <= 2: # module
                        if l[-1][-3:] == '.py':
                            self.zipfile.ZipFile(f).extract(member, self.os.path.expanduser('~/Documents/'))
                            self.shutil.move(self.os.path.expanduser('~/Documents/'+ member), self.os.path.expanduser('~/Documents/site-packages/' + l[-1]))
                    else: # package
                        if l[1] == sTarget.split('.')[0] and l[-1] != '':
                            self.zipfile.ZipFile(f).extract(member, self.os.path.expanduser('~/Documents/'))
                            if not self.os.path.exists(self.os.path.expanduser('~/Documents/site-packages/' + l[-2])):
                                self.os.mkdir(self.os.path.expanduser('~/Documents/site-packages/' + l[-2]))
                            self.shutil.move(self.os.path.expanduser('~/Documents/'+ member), self.os.path.expanduser('~/Documents/site-packages/' + l[-2] + '/' + l[-1]))
            self.shutil.rmtree(self.os.path.expanduser('~/Documents/' + l[0]))
            self.os.remove(sZ)
        locals()[sTarget.split('.')[0]] = self.importlib.import_module(sTarget.split('.')[0])
        if len(sTarget.split('.')) != 1: locals()[sTarget.split('.')[1]] = self.importlib.import_module(sTarget)
        #Reload existing modules
        reload(locals()[sTarget.split('.')[0]])
        if len(sTarget.split('.')) != 1:
            reload(locals()[sTarget.split('.')[1]])

        self.inspect.currentframe().f_back.f_globals[sTarget.split('.')[0]] = locals()[sTarget.split('.')[0]]

    def __getattr__(self,name):
        #Trying to use a built-in module (internally)
        if name in ['bs4','urllib2','shutil','os','zipfile','plistlib','importlib','inspect','html2text','pickle','console']:
            return __import__(name)
        #Trying to use a cloud module
        else:
            try:
                return self.Import(name)
            except KeyError:
                return None

import sys
sys.modules[__name__] = PseudoModule()

This doesn't work, though. I'll keep trying.
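
For comparison, here is a minimal Python 3 sketch of the same trick, an object with __getattr__ registered in sys.modules. It is not a fix for the code above and it downloads nothing: lazy_demo and LazyModule are hypothetical names, and the attribute lookup simply imports an already installed module at the point where a cloud module would run its download logic.

```python
import importlib
import sys
import types


class LazyModule(types.ModuleType):
    """Module object that resolves unknown attributes by importing them."""

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        if name.startswith('__'):
            # Let the import machinery probe __path__, __file__, etc. normally.
            raise AttributeError(name)
        try:
            module = importlib.import_module(name)  # a real version would download here
        except ImportError:
            raise AttributeError(name)
        setattr(self, name, module)  # cache so __getattr__ is not hit again
        return module


sys.modules['lazy_demo'] = LazyModule('lazy_demo')

from lazy_demo import json as lazy_json

print(lazy_json.dumps({'a': 1}))
```

Raising AttributeError for unknown names matters here: it lets a failed from ... import surface as a normal ImportError.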

Webmaster4o

Just got through a successful image upload and download with no lost data. Going to bed.

Phuket2

@Webmaster4o , I don't think it's a good idea to go away from GitHub. I'm talking about your future as a developer/manager/entrepreneur. You will need to collaborate with others. I sort of hate GitHub. But it's because I am too lazy to really invest in it, or I am not smart enough to work it out. But in my old role managing many developers, this would be a must-have skill today. Again, I think it (GitHub) is lacking, but that's from my point of view only. Please consider staying with GitHub and making it second nature for you.
It's a little hard to explain 100%, but I think all the professionals here would agree that managed code (server based, forks, versions, etc.) is a must-have. The tools will just get better and better.

[deleted]

@Webmaster4o @Phuket2 is kind, well meaning and right on one level.... but if we never launched out and did something new... we'd still be computing with mainframes and punched cards.

You have plenty of GitHub skills, more than enough for your career.

Maybe you will become the entrepreneur that develops the GitHub killer!

Anyway... have the courage of your ideas and decisions. Whatever route you take... way to go!

Webmaster4o

@Phuket2 I'm not going away from GitHub! I love GitHub, use it all the time. I've just decided not to use it to host people's code for the pythonista-cloud service, because it doesn't really make sense. I need a receptacle for other people's code where it can be accessed quickly and easily, that's all. The code people upload to Pythonista-cloud doesn't need collaboration, it just needs to be downloadable easily.

The source code for pythonista-cloud will still be on GitHub, even the server-side code.

Phuket2

@Webmaster4o , ok fair enough. I don't love GitHub, but I see its significance.

Webmaster4o

@Phuket2 I love it. If you get to know the git protocol, GitHub starts to make a lot more sense. The easiest resource for this that I've found is this guide. You probably only have to read up through "update and merge," so it's about a 5 minute read. Then, I encourage you to try the examples.

I use git almost exclusively through the command line. I google lots while I do it though :) In one week of working on a project that I tracked with git, I got the hang of git pretty well.

After a little bit of setup, the only thing I really ever do is run

git add *
git commit -m "Commit name" -m "Commit message"
git push origin master

This will add all your changes, then you commit, and then you put it on github. Of course, it's much more powerful than this, but these are the main things you'll need to do.

I know you're lazy, but I encourage you to put about an hour into wrapping your head around git. It's really useful.

Tizzy

[hijack] while we're on the topic of github, what are your workflows for the given situation:

fork repoa.
make changes to fork/repoa
pull request on repoa

(repoa and fork/repoa are out of sync because additional pull requests have been made on it and changes pushed to it)

how do you bring your fork back up to date?

(The easiest solution I could find was to delete my fork then re-fork. PS I'm using source-control UI of Xcode)

JonB

generally you would do a git fetch upstream, then git merge upstream/master (assuming the original repo is set up as a remote named upstream)

[deleted]

See here for sample JSON to replace the XML above. The JSON is hosted using cloud.File.

Phuket2

@Webmaster4o , sorry I missed your reply here. Lots of posts these days. It's great.
I assume your comments were mostly referring to desktop usage of pip? For me if I never have to use my desktop again, I would be happy. Just meaning, the more I can do on my iPad the better. I think I also psyched myself out to some degree regarding Pythonista. For a long time there was very little talk of how to get around these limitations in Pythonista. That seemed to make it seem even more daunting.
But thanks for the reference, I will give it a read and try the examples. Oh, I am only lazy when I want to be 😱 After all I am retired, I am allowed to be lazy 😁