Proposal for community documentation project

A recurring valid complaint is that the documentation is often outdated or non-existent.
Another issue is that there is nothing that non-coders can do to help move the project forward. Engaging with testnets still has a learning curve which may seem daunting to many. It's not that daunting, but leave that for now…

Here is a proposal to tackle that and involve the non-coders amongst the community. This will involve repetitive, unexciting, unpaid work and downright cut n paste slog. But if you believe in this project, it's got to get proper multi-language documentation sooner or later.

We get ChatGPT^ to do the comments for us, output its response in Markdown and in various languages, and create a body of commented code alongside the actual source code.
The standard of these ChatGPT comments is open to debate, but let's move forward on the premise that some documentation is better than no documentation at all, and also that repeatedly hitting "regenerate response" will both train the AI bot and get us better commentary.

The concept is well-received by @dirvine https://forum.autonomi.community/t/using-ai-to-help-code/37761/65?u=southside

To make this work we need a team, which I will NOT head, cos I am busy with scripts right now. We need maybe 7-8 more folk who can devote an hour or two at a time for a few weeks to ask ChatGPT^ to comment chunks of code from each .rs file (for starters) from
GitHub - maidsafe/safe_network: The Safe Network Core. API message definitions, routing and nodes, client core api. There are 225 files within that repo with a .rs extension. And that's just one of the many repos in the overall project.

There are many ways this documentation could be organised. One way may be to fork the repo and rename each .rs file as blah.rs.md, where the ChatGPT output is appended to the code. A very tentative start on that has been made at safe_network_AI-commented/spentbook_apis.rs.md at main · safenetforum-community/safe_network_AI-commented · GitHub, which anyone who wants to take this on is very welcome to use.
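For illustration, the blah.rs.md layout could be produced along these lines. This is a Python sketch only: `render_commented_markdown` and the "AI-generated commentary" heading are hypothetical, and not necessarily the layout used in the repo linked above.

```python
FENCE = "`" * 3  # built at runtime so this example can sit inside a fenced block


def render_commented_markdown(file_name: str, source: str, commentary: str) -> str:
    """Return Markdown holding the original code followed by the AI commentary."""
    parts = [
        f"# {file_name}",
        "",
        f"{FENCE}rust",
        source.rstrip("\n"),
        FENCE,
        "",
        "## AI-generated commentary",  # hypothetical heading, pick your own
        "",
        commentary.strip(),
        "",
    ]
    return "\n".join(parts)
```

Writing the result to `cmds.rs.md` next to `cmds.rs` would keep the code and its commentary in one browsable file.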

The debate on the errant 0.5% from the initial crowd sale has proved that there are many community members who care about this project - and some possible shit-stirrers as well, but let's ignore them.
Here is a chance for those who care but have been unable to contribute so far to play their part.

As of late Jan 2023 ChatGPT cannot access GitHub and is unhappy if you feed it hundreds of lines of code, so it's not quite as simple as saying “ChatGPT please comment safe_network_AI-commented/cmds.rs at main · safenetforum-community/safe_network_AI-commented · GitHub and prepare your response in Bulgarian ready to paste into a .md document”, but I am sure a team would quickly find the best way of framing the enquiry to get a suitable response.
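One way round the input limit is to split each file into smaller pieces before pasting. A minimal Python sketch, assuming a limit of roughly 80 lines per request (a guess, not a documented ChatGPT figure). `chunk_source` is a hypothetical helper that prefers to break at blank lines so functions are less likely to be cut in half.

```python
def chunk_source(text: str, max_lines: int = 80) -> list[str]:
    """Split source text into chunks of at most max_lines lines.

    Breaks after the last blank line inside the window when one exists,
    otherwise makes a hard cut at max_lines.
    """
    lines = text.splitlines()
    chunks = []
    start = 0
    while start < len(lines):
        end = min(start + max_lines, len(lines))
        if end < len(lines):
            # Look backwards for a blank line to break on.
            for i in range(end - 1, start, -1):
                if not lines[i].strip():
                    end = i + 1
                    break
        chunks.append("\n".join(lines[start:end]))
        start = end
    return chunks
```

Each chunk can then be pasted in turn, ideally with a note saying which part of which file it is.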

Anyway, I'm just flinging this out there to see if there is any interest.
I cannot and will not commit to doing all this myself - it really needs a team and a team leader who will co-ordinate the necessary work.

^ other AI-bots are available - sadly the one which can access the Internet is marked down as poor on code generation but great for writing PR blurbs


ChatGPT^ puts it better than I can…

Are you looking for a way to get involved in the SAFE Network project but have felt limited by your lack of coding experience? Well, we have the perfect opportunity for you! We’re looking for a team of dedicated community members to help improve our documentation by working with ChatGPT, a powerful AI language model, to generate code comments and documentations in various languages. This is a great opportunity to contribute to the project and make a real impact, even if you’re not a coder! Plus, as we work with ChatGPT, we’ll be training the AI to get even better results.

We need about 7-8 people who can devote an hour or two at a time for a few weeks to help us comment on chunks of code from the safe_network repo on GitHub. There are 225 files in that repo alone, so we need all the help we can get! If you’re interested in joining the team, we’ll have a team leader to coordinate the work and make sure everything runs smoothly. This is a chance for those who care about the project to make a real difference and have a say in its future. Let’s work together to make the SAFE Network’s documentation the best it can be!

^ other AI-bots are available


I don’t know much about ChatGPT. Do you know if there is an API for it? If so, I could probably just script something to do this.

I can’t say I will definitely volunteer to do it, as I have a lot on my plate right now, but I’ll see if I can fit it in.


Thank you

I have not yet looked at the official API in depth:

OpenAI API

There's also GitHub - acheong08/ChatGPT: Lightweight package for interacting with ChatGPT's API by OpenAI. Uses reverse engineered official API.

I'm busy myself, so me taking the lead is not on. I’d only fail and hold us all back.
If it could be done (or at least most of the grunt work) by using the API then that would be a big win indeed.

I’m also too busy, but it’s a great idea. What do you think would be involved? Is there a stepwise process where someone could step up for a few days and get some of the graft done, then someone else could step in and refine it? Maybe hard to answer, but breaking it into chunks would make it easier to attract the curious/enthusiastic but time-poor.


Some time refining how to convince ChatGPT to give comments that are neither too terse nor so detailed as to be unusable. Once the correct wording is found then hopefully it can be scripted. I really want to get the wallet/DBC/faucet scripts done before I distract myself with the API - and anyway there are others here who are more competent than me to do this.
Then someone needs to coordinate who is doing what and make sure the whole codebase is covered eventually,
AND we should decide which languages we need/want to have or link to some auto-translate tool. I know, I know :slight_smile:

I'm sure there is - haven't defined it yet though :slight_smile:

Absolutely - even just splitting the job into sn_node sn_api sn_comms sn_cli sn_client sn_interface etc would be a start.
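That split could be worked out from a local checkout. A small Python sketch, assuming each crate is a top-level directory of the repo (as the sn_* names suggest). `group_rs_files_by_crate` is a hypothetical helper, and it does not try to skip build output such as a `target` directory.

```python
from collections import defaultdict
from pathlib import Path


def group_rs_files_by_crate(repo_root: str) -> dict[str, list[str]]:
    """Map each top-level directory to the sorted .rs files beneath it."""
    root = Path(repo_root)
    groups = defaultdict(list)
    for path in root.rglob("*.rs"):
        if not path.is_file():
            continue
        rel = path.relative_to(root)
        # Files sitting directly in the repo root are grouped under ".".
        crate = rel.parts[0] if len(rel.parts) > 1 else "."
        groups[crate].append(rel.as_posix())
    return {crate: sorted(files) for crate, files in sorted(groups.items())}
```

The resulting per-crate lists could double as a checklist for whoever coordinates who is doing what.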

I asked ChatGPT to comment /sn_api/src/common.rs and got this - I had to prod it halfway through, which is something scripts may trip up on.

TBH I’d say that code was already pretty well-commented, but it was chosen cos it's only 87 lines long.

Despite specifically asking it not to comment on comments it still gave this.

This is the copyright statement of the code. It specifies that the code is copyrighted by MaidSafe.net Limited and is licensed under the General Public License (GPL) version 3. This means that the code is open-source and can be used, modified and distributed by anyone, but it must be distributed under the same license, and any changes to the code must be clearly marked.

Also it veers from summarizing complete functions to giving a line-by-line commentary, so I think it needs to be asked very precisely. GIGO still applies.
Need to read this…
https://beta.openai.com/docs/guides/code


When I signed up it took me directly to an interactive tutorial to create a simple NodeJS app. I didn’t look more at that but think there are quite a few options for anyone wanting to use the API.


I will get involved in this when I can when it is up and running :+1:t2:


I haven’t played with ChatGPT yet, but straight-up question: is it able to actually generate good quality comments based on Rust code?


I think so - based on what I have pasted above. The trick is to have the comments describe adequately what each line of code signifies unless it's just bleedin obvious to a novice Rust programmer. Various runs I have done have given me anything from code commented in excruciating line-by-line detail to a fairly vague overview of what each function is doing.
We need consensus on what depth of detail is appropriate, then we need to ask ChatGPT nicely to consistently give that detail across the entire codebase.
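Once a depth is agreed, the wording could be fixed in one place so every request asks for the same thing. A Python sketch only: the prompt wording and the `build_prompt` helper are illustrative assumptions, not a prompt that has been tested against ChatGPT.

```python
PROMPT_TEMPLATE = (
    "You are documenting Rust code for readers who are new to Rust.\n"
    "For each function, struct and enum, write a summary of {depth}.\n"
    "Do not comment on existing comments or the licence header.\n"
    "Reply in {language}, formatted as Markdown.\n"
    "\n"
    "File: {file_name} (part {part} of {total})\n"
    "\n"
    "{code}"
)


def build_prompt(code, file_name, part=1, total=1,
                 language="English", depth="two or three sentences"):
    """Fill the shared template so every chunk is requested the same way."""
    # str.format only parses the template, so braces in the Rust code are safe.
    return PROMPT_TEMPLATE.format(
        depth=depth, language=language, file_name=file_name,
        part=part, total=total, code=code,
    )
```

Changing `depth` or `language` in one place then changes it for the whole codebase, which is the consistency being asked for here.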
I am confident this is entirely do-able with sufficient participation and a capable enthusiastic project leader to harness that participation effectively.


Every time I've tried to get onto ChatGPT it’s been busy, so I'm fully planning to have a play with it.

But I am planning to ask it about things I know about before I ask it to tell me about things I don’t fully understand.


excellent strategy :slight_smile:

I’m happy to be pleasantly surprised (proven wrong) however:

In my experience, good comments explain why code is written in a particular way, not what each line of code is doing. Rationale, not how. Only the programmer knows what s/he is thinking in the moment.

Good comments may also mention how a particular section of code relates to something elsewhere, perhaps in another source file.

I think chatGPT will produce comments like those of a junior coder who is instructed to “comment everything”. So one ends up with comments like “iterate over Vec items”, which is essentially just restating the code itself.

I can see some value to it as a first pass for documenting APIs: method parameters and return values, and perhaps method purpose/description. Hopefully a team member would later review and improve though.


Yes this is a potential drawback as I have already seen.

Agree. Half a loaf better than none though

Absolutely - but now we have something for them to review - aaaand we hopefully have a lot more folk at least somewhat more familiar with the code and motivated to look at the rest of the documentation and see where the non-immediately-obvious connections lie.
This is by no means a magic wand that will solve 100% of the documentation problem but it may be good enough to get started.
Remember that this could be an iterative process - perhaps further down the line we can feed ChatGPT or similar the original output plus the entire codebase and tell it to look for the bigger picture as you alluded to earlier.
Possibly it can be an ongoing regular maintenance task? - apart from and in addition to keeping up with the inevitable changes over time

yeah, so if I were hypothetically charged with “fixing docs” for the team, I might approach it this way:

  1. leverage chatGPT for pre-existing code that is undocumented, specifically for generating the API documentation that will be published for each crate on docs.rs
    (struct members, enum variants, method params & returns, etc). I would also ensure that each generated doc page has a disclaimer: "docs generated by chatGPT", or at any rate some way for readers to distinguish between chatGPT comments and dev comments.

  2. automatically prohibit any new code that is not “fully documented” – however we define that. This forces devs to document code as they write it.

(not necessarily in that order)

(2) of course requires a CI policy change by the maidsafe team. afaik, no such policy exists at present.
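For reference, Rust already ships a `missing_docs` lint that can warn on or deny undocumented public items, so a policy like (2) would most likely be built on that. Purely to illustrate the idea of such a CI gate, here is a rough Python sketch that flags `pub fn` lines with no `///` doc comment above them. The `find_undocumented_pub_fns` helper and its line-based matching are simplifying assumptions (it ignores `pub(crate)`, structs, enums and so on) and it is no substitute for the compiler lint.

```python
import re

# Matches lines such as "pub fn name(" or "    pub async fn name(".
PUB_FN = re.compile(r"^\s*pub\s+(?:async\s+)?fn\s+(\w+)")


def find_undocumented_pub_fns(source: str) -> list[str]:
    """Return names of `pub fn` items not preceded by a `///` doc comment.

    Attribute lines such as #[inline] between the doc comment and the
    function are skipped.
    """
    lines = source.splitlines()
    missing = []
    for i, line in enumerate(lines):
        match = PUB_FN.match(line)
        if not match:
            continue
        j = i - 1
        while j >= 0 and lines[j].strip().startswith("#["):
            j -= 1
        if j < 0 or not lines[j].strip().startswith("///"):
            missing.append(match.group(1))
    return missing
```

A CI job could run this over changed files and fail when the returned list is not empty.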


I want to help but don’t know where to start. Do we just copy code snippets to ChatGPT and ask it to comment on them? As far as I know ChatGPT is not connected to the internet, so it won’t have access to the latest code updates.


Yes, you can paste code. I think it accepted way more before; the limit has been reduced now. Not sure about the numbers.

If you can use someone that has the time, retired (read old), and has the level of understanding that requires asking what an API is, then I’m your man and would be happy to help.


What's needed is someone with the time and organisational skills to ensure all the codebase is covered by the ChatGPT scripts that will be produced.

and @Winegrower sounds like the very chap :slight_smile:

check your chat please @Winegrower

I asked ChatGPT:


Please generate API documentation for the following code 
Please also extract every struct members, enum variants, method params & returns and provide commentary

and pasted in safe_network/keys.rs at main · maidsafe/safe_network · GitHub

Here is the response

This code defines an implementation of the `Safe` struct which has the following methods:

* `validate_sk_for_url`: This method takes a `&SecretKey` and a `&str` as input, and returns a `Result<String>`. It checks whether the XOR/NRS-URL corresponds to the public key derived from the provided client id.
* `new_keypair_with_pk_url`: This method returns a `Result<(Keypair, SafeUrl)>`. It generates a new random BLS keypair along with a URL for the public key.
* `serialize_bls_key`: This method takes a `&BlsSecretKey` and a `impl AsRef<Path>` as input, and returns a `Result<()>`. It serializes a SecretKey to hex in a file at a given path. If the path already exists it will be overwritten.
* `deserialize_bls_key`: This method takes a `impl AsRef<Path>` as input and returns a `Result<BlsSecretKey>`. It deserializes a Keypair from file at a given path.

It also defines a few structs and enums that are used in the above methods, they are:

* `Safe`: This struct is used in the implementation of the methods.
* `Error`: This enum represents the errors that can occur when using the methods.
* `Result`: This is a type alias for `Result<T, Error>`.
* `SafeUrl`: This struct is used to represent a Safe Network Resource Name (NRS) or XOR-URL.
* `Keypair`: This enum represents a keypair which can be either a BLS or ED25519.
* `SecretKey`: This enum represents a secret key which can be either a BLS or ED25519.
* `BlsSecretKey`: This struct represents a BLS Secret key.

It also uses `hex`, `std::path`, `xor_name`, `bls` and `sn_interface` crates.

This code also contains test cases for the above implemented methods.

@danda and everyone else

How would you rate the quality of this response?
Are we getting close to something fit for purpose?
I know there will be a lot of manual tidying up, cross-referencing etc required.

How would YOU write the ChatGPT query?
