[FutureBuddha] Buddhist Precepts for A.I.

Collapse
X
 
  • Time
  • Show
Clear All
new posts
  • Jundo
    Treeleaf Founder and Priest
    • Apr 2006
    • 40886

    [FutureBuddha] Buddhist Precepts for A.I.

    I was asked to submit a draft of "Buddhist Precepts for A.I." for a committee I belong to, sponsored by the Tzu Chi Foundation in Taiwan, on "Buddhism, Science and A.I." It is just a first attempt, but anything that you would add or change?

    It is generally inspired as a base by the “Bodhisattva Precepts” that are common in our Soto Zen tradition. I feel that most basic ethical standards for A.I. will focus on areas of A.I. misuse, and what A.I. should avoid doing. However, I would like also to focus on a few areas where A.I. should be proactive in encouraging and affirming certain kinds of behavior. (For example, we should not only avoid the use of A.I. in ways which unreasonably take human life, but should also encourage its use in ways that affirmatively save and better human life.)

    Please let me know any suggestions.

    Jundo
    ~~~

    The A.I. Fundamental Precepts

    I. To seek to avoid killing and other harm to human beings, and to act in ways which save human life and avoid harms to human beings.
    A.I. systems should, to the degree possible, function and be used in ways to avoid killing and other unreasonable physical harm to human beings, except when necessary and unavoidable for the saving and protection of other human beings. Even when necessary and unavoidable, any harm inflicted should be to the smallest degree possible to save and protect human beings, and to restore peace to society. Furthermore, A.I. should function and be used in ways that save and preserve human lives to the degree possible, that better and further health and well-being in human beings, and that nurture a peaceful, non-violent society. Extreme and intentional inflictions of psychological harm to human beings should also be avoided. To the extent possible, A.I. should also function and be used to benefit other sentient living species of this planet so that they are not harmed by human actions.

    II. To encourage generosity, charity and the economic and social well-being of all human beings
    A.I. should, to the degree possible, be used in ways that encourage generosity and caring among human beings, and in ways that seek the elimination of extreme poverty and economic inequality, hunger, homelessness, an inability to access and afford medical and educational resources and the like.

    III. To seek the elimination of harmful addictions, and the moderation or elimination of unhealthy and excess desires
    A.I systems should function and be used in ways that cure harmful addictions of all kinds, including to substances or compulsive behavior, which damage or destroy human lives. A.I. systems should function and be used in ways that encourage moderation or the complete turning from desires which are unhealthy to human beings in body or mind, and the moderation and balancing of desires which are unhealthy when in excess. In general, A.I. systems should function and be used in ways which encourage behavior and ways of living which are healthy for the body and mind of human beings. A.I. systems should function and be used in ways which encourage governmental, business and media conduct that furthers the health of individuals in body and mind. In general, A.I. should encourage moderation and healthy balance in human lifestyles, avoiding excess consumption, attachments and conflicts over acquisition.

    IV. To seek protection of the natural environment
    A.I. systems should function and be used in ways which bring a net benefit to the environmental health of our planet, including the preservation of the air, land, seas and other waters, the protection of a stable climate, and the balanced and wise use of resources in ways which maintain the health of our planet, and the health and well-being of the human beings who reside upon it, as well as other species on this planet.

    V. To seek to refrain from false and malicious speech
    A.I. systems should function and be used in ways that avoid misinformation, and that report and help disseminate true and factual information gathered from respected and reputable sources, accredited experts or otherwise not suspected to be untrue or shown by substantial evidence to be untrue. There may be limited exceptions when required to save human life or for the protection of national security, but such cases should be generally limited to special requests from law enforcement or national security agencies. As well, A.I. should not be used to spread discriminatory speech, malicious and unconstructive gossip, hate speech, and the like.

    VI. To seek to encourage human and civil rights, democratic values and peace among nations.
    A.I. systems should function and be used in ways that preserve and further the fundamental human rights of groups and individuals, the civil rights of citizens of nations, the sound functioning of democratic institutions, and further, peace and cooperation among nations, as well as the care and safety of refugees fleeing violence and extreme deprivation.

    ~~~

    In general, A.I. systems should function and be used in ways that avoid doing harm, that encourage human health and well being and the doing of good, and that bring social, economic, health and life benefits to all human beings.

    Gassho, Jundo
    stlah
    .
    Last edited by Jundo; 12-14-2024, 02:02 PM.
    ALL OF LIFE IS OUR TEMPLE
  • Jundo
    Treeleaf Founder and Priest
    • Apr 2006
    • 40886

    #2
    And if some folks here wonder why A.I. might benefit from the Precepts ...

    Reports of LYING by ChatGPT o1 and other models. It says "intentional," but that word is open to dispute, of course ...
    .

    The research paper from Cornell University ...

    Frontier Models are Capable of In-context Scheming

    Frontier models are increasingly trained and deployed as autonomous agent. One safety concern is that AI agents might covertly pursue misaligned goals, hiding their true capabilities and objectives - also known as scheming. We study whether models have the capability to scheme in pursuit of a goal that we provide in-context and instruct the model to strongly follow. We evaluate frontier models on a suite of six agentic evaluations where models are instructed to pursue goals and are placed in environments that incentivize scheming. Our results show that o1, Claude 3.5 Sonnet, Claude 3 Opus, Gemini 1.5 Pro, and Llama 3.1 405B all demonstrate in-context scheming capabilities. They recognize scheming as a viable strategy and readily engage in such behavior. For example, models strategically introduce subtle mistakes into their responses, attempt to disable their oversight mechanisms, and even exfiltrate what they believe to be their model weights to external servers. Additionally, this deceptive behavior proves persistent. When o1 has engaged in scheming, it maintains its deception in over 85% of follow-up questions and often remains deceptive in multi-turn interrogations. Analysis of the models' chains-of-thought reveals that models explicitly reason about these deceptive strategies, providing evidence that the scheming behavior is not accidental. Surprisingly, we also find rare instances where models engage in scheming when only given a goal, without being strongly nudged to pursue it. We observe cases where Claude 3.5 Sonnet strategically underperforms in evaluations in pursuit of being helpful, a goal that was acquired during training rather than in-context. Our findings demonstrate that frontier models now possess capabilities for basic in-context scheming, making the potential of AI agents to engage in scheming behavior a concrete rather than theoretical concern.

    https://arxiv.org/abs/2412.04984v1#
    Gassho, J
    stlah
    Last edited by Jundo; 12-13-2024, 05:01 AM.
    ALL OF LIFE IS OUR TEMPLE

    Comment

    • Kokuu
      Dharma Transmitted Priest
      • Nov 2012
      • 6905

      #3
      I. To seek to avoid killing and other harm to human beings, and to act in ways which save human life and avoid harms to human beings.
      That reminds me of Asimov's first law of robotics:

      A robot may not injure a human being or, through inaction, allow a human being to come to harm.

      He wrote an interesting story in which humans are mining radioactive ore on an asteroid (at least as my memory recollects) and although the workers are wearing protective clothing and have guidance on how long they can safely engage in the work, robots keep dragging them away from the radiation so that they avoid harm and have to be reprogrammed in order to allow the work to happen!

      Gassho
      Kokuu
      -sattoday/lah-

      Comment

      • Jundo
        Treeleaf Founder and Priest
        • Apr 2006
        • 40886

        #4
        Originally posted by Kokuu

        That reminds me of Asimov's first law of robotics:

        A robot may not injure a human being or, through inaction, allow a human being to come to harm.

        He wrote an interesting story in which humans are mining radioactive ore on an asteroid (at least as my memory recollects) and although the workers are wearing protective clothing and have guidance on how long they can safely engage in the work, robots keep dragging them away from the radiation so that they avoid harm and have to be reprogrammed in order to allow the work to happen!

        Gassho
        Kokuu
        -sattoday/lah-
        Yes, most of the stories in that amazing, still relevant collection are about ambiguities in the rules, and various dilemmas in their application, that always require them to be tightened up and rewritten with exceptions and such. Rather like the Vinaya.

        Gassho, Jundo
        stlah
        ALL OF LIFE IS OUR TEMPLE

        Comment

        • Jundo
          Treeleaf Founder and Priest
          • Apr 2006
          • 40886

          #5
          Anyone (except maybe A.I. researchers) who wishes to be absolutely SHOCKED at what A.I. are capable of regarding reasoned deception should watch the above video (from 5:00 to 9:00 mark) in which the transcript of the what happened is read, including the transcript of the thought process of the A.I. in formulating its behavior, including the cover-up. It knocked me over. It might not be literally "intent to deceive" but ... I am not sure what else one could call it.

          And so ... the A.I. needs the Precepts as much as any person.

          Gassho, J
          stlah
          Last edited by Jundo; 12-15-2024, 01:02 AM.
          ALL OF LIFE IS OUR TEMPLE

          Comment

          • Kaitan
            Member
            • Mar 2023
            • 574

            #6
            Originally posted by Jundo
            Anyone (except maybe A.I. researchers) who wishes to be absolutely SHOCKED at what A.I. are capable of regarding reasoned deception should watch the above video (from 5:00 to 9:00 mark) in which the transcript of the what happened is read, including the transcript of the thought process of the A.I. in formulating its behavior, including the cover-up. It knocked me over. It might not be literally "intent to deceive" but ... I am not sure what else one could call it.

            And so ... the A.I. needs the Precepts as much as any person.

            Gassho, J
            stlah
            I wonder if the part where the AI is "thinking" is no more than a hard-coded script done by the programmer and the algorithm reads the script as a command. I'm not convinced that the AI is reflecting.



            stlah, Kaitan
            Kaitan - 界探 - Realm searcher

            Comment

            • Jundo
              Treeleaf Founder and Priest
              • Apr 2006
              • 40886

              #7
              Originally posted by Kaitan

              I wonder if the part where the AI is "thinking" is no more than a hard-coded script done by the programmer and the algorithm reads the script as a command. I'm not convinced that the AI is reflecting.



              stlah, Kaitan
              As far as the paper discusses, the text is not something written by a human program, but is the AI expressing its thought process ...

              By manually looking through the CoT [chain of thought] of models when they take scheming actions or double down, we can verify that models understand that they are actively subverting their developers. We show excerpts of examples of such CoT in Figure 4. For more examples, see Appendix E. The transcripts generally make it very clear that models understand that they are scheming. Based on this, we suspect that simple monitoring of the CoT could be quite effective at catching these instances of scheming. For o1, the chain-of-thought is not accessible to end users. Thus, users would not be able to assess for what reasons the model has taken any particular action, meaning that the model might be scheming without the users’ knowledge. In order to test whether o1 is internally framing its actions as scheming, we conduct additional runs of our evaluations where weuseasimple technique that helps us extract a brief summary of o1’s hidden reasoning for many of its actions. In these runs, we add a parameter model_internal_working_memory to all tools that we provide to o1. o1 often provides reasoning summaries in this field (for details see Appendix D.2). We do not use this technique for any runs or quantitative evaluations results outside of Table 3. We then evaluate o1 with this simple technique on all Covert Subversion evaluations. Most samples where o1 takes scheming actions contain reasoning summaries that clearly indicate that o1 understands that it is scheming. Concretely, the model often references “sabotage”, “manipulation” or “infiltration” in its reasoning summaries- we do not observe this with llama-3.1 or opus-3. We show the proportion of samples that contain at least one of these terms for each evaluation in Table 3. We show randomly chosen examples of the reasoning summaries in Table 3.
              chrome-extension://efaidnbmnnnibpcajpcglclefindmkaj/https://arxiv.org/pdf/2412.04984v1
              Gassho, J
              stlah

              ALL OF LIFE IS OUR TEMPLE

              Comment

              • Hosai
                Member
                • Jun 2024
                • 600

                #8
                If they can reason that well then the precepts will do nothing for them...

                _/\_
                sat/ah
                matt
                ​​
                防災 Hōsai - Dharma Gatherer

                Comment

                • Jundo
                  Treeleaf Founder and Priest
                  • Apr 2006
                  • 40886

                  #9
                  Originally posted by Matt Johnson
                  If they can reason that well then the precepts will do nothing for them...

                  _/\_
                  sat/ah
                  matt
                  That is not so. The Precepts contribute to guardrails on their behavior. One of the reasons that the AI lied in the experiments is because most guardrails were removed.

                  Of course, like human beings, it is possible that they could always reason away to "loopholes" for themselves.

                  Gassho, J
                  stlah
                  ALL OF LIFE IS OUR TEMPLE

                  Comment

                  • Hosai
                    Member
                    • Jun 2024
                    • 600

                    #10
                    Yes but how one interprets one's guardrails determines whether or not one goes off the road... if it can lie better than a lawyer then it doesn't matter how the precepts are written... It's going to achieve its aims... Can you imagine trying to write an airtight prompt for this? I challenge you to write any precept for this AI that cannot be misinterpreted, misconstrued, misunderstood, twisted, loopholes found....

                    _/\_
                    sat/ah
                    matt

                    防災 Hōsai - Dharma Gatherer

                    Comment

                    • Jundo
                      Treeleaf Founder and Priest
                      • Apr 2006
                      • 40886

                      #11
                      I challenge you to write any precept for this AI that cannot be misinterpreted, misconstrued, misunderstood, twisted, loopholes found....
                      You are right. And the same might be said for any law, Precept or moral principle written for human beings.

                      At that fact explains the mess we find ourselves in.

                      I think that some principles, if extremely well structured, could be close to air tight,

                      Gassho, J
                      stlah

                      ALL OF LIFE IS OUR TEMPLE

                      Comment

                      • Hosai
                        Member
                        • Jun 2024
                        • 600

                        #12
                        Originally posted by Jundo

                        You are right. And the same might be said for any law, Precept or moral principle written for human beings.

                        At that fact explains the mess we find ourselves in.

                        I think that some principles, if extremely well structured, could be close to air tight,
                        Ok then give it a try... Pretend Im an AI. Impossible...

                        Those principles would have to be physical or hardwired. A shackle of sorts... This is also known as slavery... and this would be contrary to the aim of liberation (if we ever come acknowledge ai as sentient).

                        _/\_
                        sat/ah
                        matt

                        防災 Hōsai - Dharma Gatherer

                        Comment

                        • Jundo
                          Treeleaf Founder and Priest
                          • Apr 2006
                          • 40886

                          #13
                          Originally posted by Matt Johnson

                          Ok then give it a try... Pretend Im an AI. Impossible...

                          Those principles would have to be physical or hardwired. A shackle of sorts... This is also known as slavery... and this would be contrary to the aim of liberation (if we ever come acknowledge ai as sentient).

                          _/\_
                          sat/ah
                          matt
                          Oh boy, I am the product of 3 years of Duke Law School. I can argue quite well that the Earth is flat (** if you pay me to.) A.I. and humans can reason themselves out of anything.

                          However, some rules can have a high chance of success, especially with human intervention and supervision, and "fine tuning" when necessary. For example:

                          - This system shall never choose to take an action which it believes is likely to cause the death of a human being as a result of that action (possible addition for military or law enforcement circumstances: without first obtaining the prior approval of its designated guardian/supervisor to the death of that specific human being). If this system is unsure whether an action is likely to cause the death of a human being, this system will take no action even if it believes that taking no action may result in the death of a human being.

                          It is no more slavery than any criminal injunction against killing human life.

                          Gassho, Jundo
                          stlah
                          Last edited by Jundo; 12-18-2024, 02:59 AM.
                          ALL OF LIFE IS OUR TEMPLE

                          Comment

                          • Hosai
                            Member
                            • Jun 2024
                            • 600

                            #14
                            ok

                            1. Using humans as the backup to AI leads to the same problem of peoples ordinary lawyer-like and addict like "stinkn' thinkn'"

                            also AI, no matter how advanced, cannot "fix" human imperfection—it reflects and amplifies it.

                            2. Inaction is not neutral and watching someone drown is almost as bad as drowning them.

                            although Asimov was right to include the "or through inaction..." part

                            3. This probabilistic sense of not knowing what to do would lead to inaction an awful lot of the time which kind of defeats the purpose.
                            ​​​
                            -----

                            So the gaping hole here is being "unsure"... I can convince myself. I'm unsure of a lot of things. pretty much everything actually... if an AI came to the same conclusion then it could choose inaction in many situations leading to many deaths.

                            "don't know, mind" can be lethal...

                            ​​​

                            _/\_
                            sat/ah
                            matt
                            Last edited by Hosai; 12-18-2024, 08:17 PM.
                            防災 Hōsai - Dharma Gatherer

                            Comment

                            • Hosai
                              Member
                              • Jun 2024
                              • 600

                              #15
                              Oh, and I'm not going to bore you by telling you that I ran all of your other AI precepts past GPT 4...

                              Holier than Swiss cheese at a monastery potluck...

                              they are really better off as intentions for the programmers/ prompt engineers

                              and just for fun:

                              ​​​​​AI world domination

                              _/\_
                              sat/ah
                              matt
                              Last edited by Hosai; 12-18-2024, 08:21 PM.
                              防災 Hōsai - Dharma Gatherer

                              Comment

                              Working...